Mixture of Experts

3 articles tagged with Mixture of Experts

April 27, 2026
model release

Alibaba Qwen Releases 35B Sparse MoE Model with 262K Context and Multimodal Support

Alibaba Cloud has released Qwen3.6-35B-A3B, an open-weight sparse mixture-of-experts model with 35 billion total parameters but only 3 billion active parameters per token. The model features a 262K native context window (expandable to 1M tokens), multimodal input support, and integrated reasoning mode with preserved thinking traces.

April 22, 2026
model releaseArcee Ai

Arcee AI Releases Trinity Large Preview: 400B-Parameter MoE Model with 512K Context Window

Arcee AI has released Trinity Large Preview, a 400B-parameter sparse Mixture-of-Experts model with 13B active parameters per token using 4-of-256 expert routing. The model supports context windows up to 512K tokens and is available with open weights under permissive licensing.

April 17, 2026
model release

Alibaba Qwen Releases 35B Parameter Qwen3.6-35B-A3B Model with 262K Native Context Window

Alibaba Qwen has released Qwen3.6-35B-A3B, a 35-billion parameter mixture-of-experts model with 3 billion activated parameters and a 262,144-token native context window extendable to 1,010,000 tokens. The model scores 73.4 on SWE-bench Verified and features FP8 quantization with performance metrics nearly identical to the original model.