omnimodal

2 articles tagged with omnimodal

April 22, 2026
model release, Xiaomi, +1

Xiaomi Launches MiMo-V2.5 With 1M Context Window at $0.40 per Million Input Tokens

Xiaomi released MiMo-V2.5 on April 22, 2026, a native omnimodal model with a 1,048,576-token context window. The model is priced at $0.40 per million input tokens and $2 per million output tokens, positioning it as a cost-efficient option for agentic applications that need image and video understanding.

March 31, 2026
model release, +1

Alibaba's Qwen3.5-Omni learns to write code from speech and video without explicit training

Alibaba has released Qwen3.5-Omni, an omnimodal model that handles text, images, audio, and video with a 256,000-token context window. The model reportedly outperforms Google's Gemini 3.1 Pro on audio tasks and supports 74 languages in speech recognition, a 6x increase over its predecessor. It also shows an unexpected emergent capability: writing working code from spoken instructions and video input, which the team did not explicitly train it to do.