Alibaba Qwen Releases 35B Sparse MoE Model with 262K Context and Multimodal Support

TL;DR

Alibaba Cloud has released Qwen3.6-35B-A3B, an open-weight sparse mixture-of-experts model with 35 billion total parameters but only 3 billion active parameters per token. The model features a 262K native context window (expandable to 1M tokens), multimodal input support, and integrated reasoning mode with preserved thinking traces.

Qwen3.6 35B A3B — Quick Specs

  • Context window: 262K tokens
  • Input: $0.1612 / 1M tokens
  • Output: $0.9653 / 1M tokens

Alibaba Cloud has released Qwen3.6-35B-A3B, an open-weight sparse mixture-of-experts model with 35 billion total parameters but only 3 billion active parameters per token.

Architecture and Specifications

The model uses a hybrid sparse MoE architecture that combines Gated DeltaNet linear attention with standard gated attention layers, according to Alibaba. This design reduces computational requirements by activating only 3 billion parameters per token while maintaining the capacity of the full 35 billion parameter model.
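The routing idea behind sparse MoE can be illustrated with a toy top-k gate. This is a hedged sketch of the general technique, not Qwen's actual implementation: a router scores every expert for each token, but only the top-k experts run, so per-token compute scales with k rather than with the total expert count.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the top-k experts.

    Only these k experts are executed for the token; their outputs are
    combined using the renormalized routing weights.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the selected experts
    return [(i, probs[i] / norm) for i in top]

# 8 experts, top-2 routing: only a quarter of the experts fire for this token.
selected = route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
```

In a full model each selected expert is a feed-forward block; the sparsity ratio (here 2 of 8) is what lets a 35B-parameter model run with roughly 3B parameters active per token.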

Qwen3.6-35B-A3B supports a native context window of 262,144 tokens, extensible to 1 million tokens using YaRN (Yet another RoPE extensioN method). The model accepts text, image, and video inputs, making it a multimodal system.
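YaRN extension is typically expressed as a RoPE scaling factor over the native window. The snippet below is an illustrative configuration fragment; the factor of 4.0 is an assumption chosen to reach roughly 1M tokens from the 262,144-token base, not a value published by Alibaba.

```python
# Hypothetical YaRN rope-scaling settings (illustrative values only).
NATIVE_CONTEXT = 262_144

yarn_config = {
    "rope_type": "yarn",
    "factor": 4.0,  # assumed scaling factor: 262,144 * 4 = 1,048,576 tokens
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

# Effective context window after scaling:
extended_window = int(NATIVE_CONTEXT * yarn_config["factor"])
```

The resulting 1,048,576-token window is what is commonly rounded to "1 million tokens" in model announcements.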

Key Capabilities

The model includes:

  • Reasoning mode: Integrated thinking capability with reasoning traces preserved across multi-turn conversations
  • Function calling: Native support for tool use and function execution
  • Structured output: Ability to generate responses that conform to a requested format, such as a JSON schema
  • Multimodal processing: Handles text, images, and video inputs
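
Function calling on OpenRouter follows the OpenAI-compatible chat-completions format. The payload below is a sketch under assumptions: the model slug `qwen/qwen3.6-35b-a3b` and the `get_weather` tool are hypothetical placeholders for illustration.

```python
import json

# Illustrative request body for an OpenAI-compatible chat-completions
# endpoint with a tool definition. Model slug and tool are assumptions.
payload = {
    "model": "qwen/qwen3.6-35b-a3b",  # hypothetical slug
    "messages": [
        {"role": "user", "content": "What's the weather in Hangzhou?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

body = json.dumps(payload)  # serialized request body
```

If the model decides the tool is needed, the response would carry a `tool_calls` entry with the function name and JSON arguments rather than a plain text answer.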

Pricing and Availability

The model is available through OpenRouter at $0.1612 per million input tokens and $0.9653 per million output tokens. Alibaba has released it under the Apache 2.0 license, making the model weights freely available for commercial and research use.

The sparse MoE architecture positions Qwen3.6-35B-A3B as a cost-efficient alternative to dense models, as only 8.6% of parameters are active during inference.
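The arithmetic behind that figure, plus a rough per-request cost at the quoted OpenRouter rates, can be checked directly; the 10K-input / 1K-output request size is an arbitrary example.

```python
# Active-parameter fraction: 3B active of 35B total.
total_params = 35e9
active_params = 3e9
active_fraction = active_params / total_params  # ~0.0857, i.e. ~8.6%

# Quoted OpenRouter rates (USD per 1M tokens).
input_rate = 0.1612
output_rate = 0.9653

# Example request: 10,000 input tokens, 1,000 output tokens.
cost = 10_000 / 1e6 * input_rate + 1_000 / 1e6 * output_rate
# ~ $0.0026 for the whole request
```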

What This Means

The sparse MoE approach with only 3B active parameters per token makes this 35B model competitive on inference cost with much smaller dense models while potentially retaining more knowledge capacity. The 262K native context window and multimodal capabilities make it suitable for document analysis and video understanding tasks. However, benchmark scores are not yet publicly available, making it difficult to assess performance relative to other models in its class. The Apache 2.0 license and availability through OpenRouter lower the barrier to adoption for developers seeking open-weight alternatives to proprietary models.
