Alibaba Releases Qwen3.6 Max Preview: 1 Trillion Parameter MoE Model With 262K Context Window
Alibaba Cloud has released Qwen3.6 Max Preview, a proprietary frontier model built on a sparse mixture-of-experts (MoE) architecture with approximately 1 trillion total parameters. The model supports a 262,144-token context window and features an integrated thinking mode for multi-turn reasoning, priced at $1.30 per million input tokens and $7.80 per million output tokens.
Qwen3.6 Max Preview — Quick Specs
- Architecture: sparse mixture-of-experts (MoE)
- Total parameters: ~1 trillion (active parameters undisclosed)
- Context window: 262,144 tokens
- Pricing: $1.30 per million input tokens, $7.80 per million output tokens
- Access: API only (Alibaba Cloud Model Studio, Qwen Studio, OpenRouter); no open weights
Technical Specifications
According to Alibaba, Qwen3.6 Max Preview is optimized for agentic coding, tool use, and long-context reasoning. The model includes an integrated thinking mode that preserves reasoning traces across multi-turn conversations, similar to approaches seen in other reasoning-capable models.
The model supports structured output and function calling. The sparse MoE architecture activates only a subset of the 1 trillion total parameters on each forward pass, which typically yields lower inference cost than a dense model of equivalent size.
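Function calling in OpenAI-compatible APIs (the format OpenRouter exposes) works by declaring tools as JSON Schema in the request body. The sketch below shows what such a request could look like; the `get_weather` tool, its parameters, and the `qwen/qwen3.6-max-preview` model slug are illustrative assumptions, not taken from Alibaba's or OpenRouter's documentation.

```python
# Hedged sketch of a function-calling request in the OpenAI-compatible
# schema. The tool definition and model slug below are assumptions for
# illustration only.
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

request_body = {
    "model": "qwen/qwen3.6-max-preview",  # assumed OpenRouter slug
    "messages": [{"role": "user", "content": "What's the weather in Hangzhou?"}],
    "tools": tools,
}
print(json.dumps(request_body, indent=2))
```

If the model decides a tool is needed, the response carries a `tool_calls` entry with the function name and JSON arguments for the caller to execute.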
Availability and Access
Qwen3.6 Max Preview is available exclusively through the Alibaba Cloud Model Studio and Qwen Studio APIs. No open weights are provided. The model is also accessible through OpenRouter, which routes requests across multiple providers.
The model was released on April 27, 2025, according to the listing on OpenRouter.
Pricing Structure
- Input tokens: $1.30 per million tokens
- Output tokens: $7.80 per million tokens
The 6:1 ratio between output and input pricing reflects the higher compute cost of autoregressive output generation relative to prompt processing, a cost structure common across large language model APIs.
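At these rates, per-request cost is straightforward to estimate from token counts. A minimal calculator using the listed prices:

```python
# Estimate API cost for a single request at Qwen3.6 Max Preview's
# listed rates.
INPUT_PRICE_PER_M = 1.30   # USD per million input tokens
OUTPUT_PRICE_PER_M = 7.80  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 100K-token prompt (well within the 262K window)
# with a 2K-token response.
print(round(estimate_cost(100_000, 2_000), 4))  # 0.1456
```

Note how a long-context prompt dominates the bill even at the cheaper input rate: here the 100K input tokens cost $0.13 versus $0.0156 for the output.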
Reasoning Capabilities
The integrated thinking mode allows the model to show step-by-step reasoning processes. According to OpenRouter's documentation, developers can access reasoning traces through a reasoning_details array in API responses. The model can preserve these reasoning traces when passed back in subsequent conversation turns, enabling continuous reasoning across multi-turn interactions.
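The loop described above amounts to echoing the `reasoning_details` array back into the conversation history on each turn. A minimal sketch of that bookkeeping, assuming the field names OpenRouter documents (the trace content and messages here are placeholders, not real model output):

```python
# Sketch of carrying reasoning traces across turns. The field name
# "reasoning_details" follows OpenRouter's documentation as described
# above; the placeholder trace and messages are illustrative only.

def extend_conversation(messages, assistant_msg, next_user_turn):
    """Append the assistant reply, preserving its reasoning trace,
    then append the next user message."""
    turn = {"role": "assistant", "content": assistant_msg["content"]}
    if "reasoning_details" in assistant_msg:
        # Pass the trace back verbatim so reasoning continues next turn.
        turn["reasoning_details"] = assistant_msg["reasoning_details"]
    return messages + [turn, {"role": "user", "content": next_user_turn}]

history = [{"role": "user", "content": "Plan a schema migration in steps."}]
reply = {  # stands in for a parsed API response
    "content": "Step 1: inventory the current schema...",
    "reasoning_details": [{"type": "reasoning.text",
                           "text": "(placeholder trace)"}],
}
history = extend_conversation(history, reply, "Expand step 1.")
```

Dropping the trace instead of echoing it back would force the model to re-derive its earlier reasoning from the visible text alone.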
What This Means
Qwen3.6 Max Preview represents Alibaba's entry into the trillion-parameter model tier, joining other frontier models in supporting extended context windows beyond 200K tokens. The sparse MoE architecture suggests a focus on inference efficiency, though Alibaba has not disclosed the number of active parameters per forward pass. The proprietary, API-only release contrasts with Alibaba's previous pattern of releasing open-weight Qwen models, indicating a strategic shift toward commercial model offerings. The emphasis on agentic coding and tool use positions this model for enterprise workflows requiring autonomous task execution.
Related Articles
Alibaba's Qwen Team Releases Qwen3.6 27B With 262K Context Window and Video Processing
Alibaba's Qwen Team has released Qwen3.6 27B, a 27-billion parameter multimodal language model with a 262,144-token context window. The model accepts text, image, and video inputs and includes a built-in thinking mode for extended reasoning, with pricing at $0.195 per million input tokens and $1.56 per million output tokens.
DeepSeek Releases V4-Flash: 284B-Parameter MoE Model With 1M Token Context at 27% Inference Cost
DeepSeek released two Mixture-of-Experts models: V4-Flash with 284B total parameters (13B activated) and V4-Pro with 1.6T parameters (49B activated). Both models support one million token context windows and use a hybrid attention architecture that requires only 27% of the inference FLOPs compared to DeepSeek-V3.2 at 1M token context.
DeepSeek V4 Flash Released: 284B Parameter MoE Model with 1M Context Window at $0.14 per Million Tokens
DeepSeek has released V4 Flash, a Mixture-of-Experts model with 284B total parameters and 13B activated parameters per request. The model supports a 1,048,576-token context window and is priced at $0.14 per million input tokens and $0.28 per million output tokens.
DeepSeek Releases V4-Pro: 1.6T Parameter MoE Model with 1M Token Context
DeepSeek released two new Mixture-of-Experts models: DeepSeek-V4-Pro with 1.6 trillion parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated), both supporting one million token context length. The models achieve 27% of inference FLOPs and 10% of KV cache compared to DeepSeek-V3.2 at 1M context through a hybrid attention architecture combining Compressed Sparse Attention and Heavily Compressed Attention.