Mixture of Experts

3 articles tagged with Mixture of Experts

April 27, 2026

model release

Alibaba Qwen Releases 35B Sparse MoE Model with 262K Context and Multimodal Support

Alibaba Cloud has released Qwen3.6-35B-A3B, an open-weight sparse mixture-of-experts model with 35 billion total parameters, of which only 3 billion are active per token. The model features a 262K native context window (expandable to 1M tokens), multimodal input support, and an integrated reasoning mode with preserved thinking traces.

April 27, 2026 · 3:51 AM

April 22, 2026

model release · Arcee AI

Arcee AI Releases Trinity Large Preview: 400B-Parameter MoE Model with 512K Context Window

Arcee AI has released Trinity Large Preview, a 400B-parameter sparse Mixture-of-Experts model with 13B active parameters per token using 4-of-256 expert routing. The model supports context windows up to 512K tokens and is available with open weights under permissive licensing.

April 22, 2026 · 4:36 PM
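
The 4-of-256 expert routing mentioned in the Trinity Large Preview summary can be illustrated with a minimal top-k gating sketch. Only the expert count (256) and the number selected per token (4) come from the article. The function name `top_k_route`, the random router logits, and the choice to renormalize with a softmax over the selected experts are assumptions for illustration, not details Arcee AI has confirmed.

```python
import math
import random

NUM_EXPERTS = 256  # from the article summary
TOP_K = 4          # from the article summary


def top_k_route(logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Hypothetical helper: weights come from a softmax over the selected
    logits only, which is one common convention, assumed here.
    """
    # Indices of the k largest router logits, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in for the scores a learned router would produce for one token.
rng = random.Random(0)
router_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

routes = top_k_route(router_logits, TOP_K)

# Share of experts that run for this token: 4 / 256 = 0.015625.
active_expert_fraction = TOP_K / NUM_EXPERTS
```

In this sketch only 4 of the 256 experts run per token, about 1.6% of the expert pool. The reported ratio of 13B active to 400B total parameters is higher, about 3.25%, presumably because components outside the expert layers run for every token.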

April 17, 2026

model release

Alibaba Qwen Releases 35B Parameter Qwen3.6-35B-A3B Model with 262K Native Context Window

Alibaba Qwen has released Qwen3.6-35B-A3B, a 35-billion-parameter mixture-of-experts model with 3 billion activated parameters and a 262,144-token native context window extendable to 1,010,000 tokens. The model scores 73.4 on SWE-bench Verified and is offered with FP8 quantization, with performance metrics nearly identical to those of the original model.

April 17, 2026 · 6:36 AM
