sparse-moe
1 article tagged with sparse-moe
April 28, 2026
model release · Xiaomi
Xiaomi releases MiMo-V2.5: 310B-parameter omnimodal model with 1M-token context window
Xiaomi released MiMo-V2.5, a sparse mixture-of-experts model with 310B total parameters, of which 15B are activated per token. The omnimodal model supports text, image, video, and audio understanding with a 1M-token context window, and was trained on 48T tokens using FP8 mixed precision.
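The "310B total / 15B active" figures describe the standard sparse-MoE pattern: a router scores every token against a pool of expert feed-forward networks and dispatches it to only a few of them, so most parameters sit idle on any given forward pass. The sketch below shows a minimal top-k routed MoE layer in PyTorch; the dimensions, expert count, and `top_k` value are illustrative assumptions for clarity, not MiMo-V2.5's actual (undisclosed) configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Minimal top-k sparse mixture-of-experts layer.

    Each token is routed to `top_k` of `num_experts` expert MLPs, so only a
    small fraction of the layer's parameters run per token. All sizes here
    are illustrative, not MiMo-V2.5's real architecture.
    """

    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_ff),
                nn.GELU(),
                nn.Linear(d_ff, d_model),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (batch, seq, d_model)
        logits = self.router(x)                                # (B, S, E)
        weights, indices = logits.topk(self.top_k, dim=-1)     # (B, S, k)
        weights = F.softmax(weights, dim=-1)  # renormalize over chosen experts
        out = torch.zeros_like(x)
        # Loop over experts for clarity; production systems batch the
        # per-expert tokens and use fused/grouped kernels instead.
        for e, expert in enumerate(self.experts):
            mask = indices == e                                # (B, S, k)
            if not mask.any():
                continue
            gate = (weights * mask).sum(dim=-1, keepdim=True)  # (B, S, 1)
            token_mask = mask.any(dim=-1)                      # (B, S)
            out[token_mask] += gate[token_mask] * expert(x[token_mask])
        return out

layer = SparseMoE()
tokens = torch.randn(2, 16, 512)
print(layer(tokens).shape)  # torch.Size([2, 16, 512])
```

With 8 experts and `top_k=2`, each token touches roughly a quarter of the expert parameters; scaling the same ratio up is how a 310B-parameter model keeps per-token compute near that of a 15B dense model.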