EquiRoute technical report / preprint.

Sparse mixture-of-experts (MoE) routing improves model capacity while keeping per-token computation manageable. In multimodal large language models (MLLMs), routing is performed over a shared token stream whose modality composition is highly imbalanced: visual inputs may contribute hundreds of tokens, whereas paired text prompts contribute only a few dozen. Under this asymmetry, standard token-level load balancing is dominated by the majority modality, so aggregate expert utilization appears balanced while masking modality-specific inequity.

This paper studies modality-equitable routing in shared sparse MoE layers and proposes EquiRoute, a lightweight routing framework with three components: Entropy-Guided Token Budgeting (ETB), which uses modality-level routing entropy to modulate dispatch pressure; Cross-Modality Expert Reservation (CER), which introduces soft capacity targets that reduce minority-modality starvation; and Modality-Contrastive Routing (MCR), which encourages distinct modality-level routing distributions while retaining a shared expert pool. The Modality Equity Index (MEI) is defined as a diagnostic for cross-modality similarity in expert access.

In a 16-expert two-modality setting, increasing the vision:text token ratio from 4:1 to 32:1 under standard routing reduces the text share among the four most loaded experts from 0.200 to 0.029 and decreases MEI from 0.875 to 0.250. Under the same synthetic setup, EquiRoute increases MEI to 0.938, 0.875, 0.750, and 0.625 at ratios 4:1, 8:1, 16:1, and 32:1, respectively. These controlled results quantify the scale of imbalance motivating the method. The report positions EquiRoute relative to prior sparse MoE and multimodal routing work and provides a fully specified training objective, diagnostics, and evaluation protocol.

Existing OSF archival DOI: 10.17605/OSF.IO/HFVBC; Existing OSF archival page: https://osf.io/hfvbc/.
Files include the technical report PDF and the LaTeX source tarball when available.
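The abstract reports the text share among the four most loaded experts as one of its imbalance measurements. The snippet below is a minimal sketch of how such a share could be computed from per-token routing decisions. It is not the report's implementation: the function name `top_k_text_share`, the top-1 routing assumption, and both toy routing patterns are hypothetical and chosen only for illustration. It does not attempt to compute MEI, whose definition is not given in the abstract.

```python
from collections import Counter

def top_k_text_share(expert_ids, is_text, k=4):
    """Fraction of text tokens among all tokens sent to the k most loaded experts.

    expert_ids: expert index chosen for each token (top-1 routing assumed).
    is_text:    True for text tokens, False for vision tokens.
    """
    total = Counter(expert_ids)
    text = Counter(e for e, t in zip(expert_ids, is_text) if t)
    # Ties in load are broken by first occurrence in expert_ids.
    top = [e for e, _ in total.most_common(k)]
    routed = sum(total[e] for e in top)
    return sum(text[e] for e in top) / routed if routed else 0.0

# Toy data at a 4:1 vision:text ratio with 16 experts (hypothetical).
is_text = [False] * 64 + [True] * 16

# Case 1: both modalities are spread evenly over all experts.
even_ids = [i % 16 for i in range(64)] + [i % 16 for i in range(16)]
even_share = top_k_text_share(even_ids, is_text)

# Case 2: vision tokens crowd experts 0-3, text is pushed to experts 4-15.
skewed_ids = [i % 4 for i in range(64)] + [4 + i % 12 for i in range(16)]
skewed_share = top_k_text_share(skewed_ids, is_text)

print(round(even_share, 3), round(skewed_share, 3))
```

In the evenly spread case the share equals the global text fraction, 1/5 = 0.200 at a 4:1 ratio. In the skewed case the most loaded experts serve no text tokens at all, even though every expert still receives tokens, which is the kind of disparity an aggregate load statistic does not show.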
Haopeng Jin
Beijing University of Posts and Telecommunications
Haopeng Jin (Mon,) studied this question.
www.synapsesocial.com/papers/69ec5b8a88ba6daa22dad0d5 — DOI: https://doi.org/10.5281/zenodo.19712482