model release

Alibaba Releases Qwen3.6 Max Preview: 1 Trillion Parameter MoE Model With 262K Context Window

TL;DR

Alibaba Cloud has released Qwen3.6 Max Preview, a proprietary frontier model built on sparse mixture-of-experts architecture with approximately 1 trillion total parameters. The model supports a 262,144-token context window and features integrated thinking mode for multi-turn reasoning, priced at $1.30 per million input tokens and $7.80 per million output tokens.


Qwen3.6 Max Preview — Quick Specs

  • Context window: 262K tokens
  • Input: $1.30 per 1M tokens
  • Output: $7.80 per 1M tokens


Alibaba Cloud has released Qwen3.6 Max Preview, a proprietary frontier model with approximately 1 trillion total parameters using a sparse mixture-of-experts (MoE) architecture. The model supports a 262,144-token context window and is priced at $1.30 per million input tokens and $7.80 per million output tokens.

Technical Specifications

According to Alibaba, Qwen3.6 Max Preview is optimized for agentic coding, tool use, and long-context reasoning. The model includes an integrated thinking mode that preserves reasoning traces across multi-turn conversations, similar to approaches seen in other reasoning-capable models.

The model supports structured output and function calling capabilities. The sparse MoE architecture activates only a subset of the 1 trillion total parameters for each inference, which typically provides efficiency advantages over dense models of equivalent size.
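Alibaba has not published architectural details beyond the total parameter count, but the efficiency argument behind sparse MoE can be illustrated with a minimal top-k routing sketch. All sizes and the gating scheme below are illustrative assumptions, not Qwen's actual configuration:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k of n experts.

    Only k expert weight matrices participate in the forward pass,
    so compute scales with k, not with the total expert count.
    """
    logits = x @ gate_w                       # (n_experts,) router scores
    top_k = np.argsort(logits)[-k:]           # indices of the k best experts
    weights = np.exp(logits[top_k] - logits[top_k].max())
    weights /= weights.sum()                  # softmax over the selected experts
    # Weighted sum of the k active experts' outputs; the rest stay idle.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top_k))

rng = np.random.default_rng(0)
d_model, n_experts = 64, 8
x = rng.standard_normal(d_model)
gate_w = rng.standard_normal((d_model, n_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

y = moe_forward(x, gate_w, experts, k=2)      # only 2 of 8 experts do work
print(y.shape)                                # (64,)
```

Because only the routed experts' weights are touched per token, per-inference compute in a trillion-parameter MoE can be closer to that of a much smaller dense model.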

Availability and Access

Qwen3.6 Max Preview is an API-only release; no open weights are provided. The model is available through the Alibaba Cloud Model Studio and Qwen Studio APIs, and is also accessible through OpenRouter, which routes requests across multiple providers.

The model was released on April 27, 2025, according to the listing on OpenRouter.
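As a minimal sketch of access through OpenRouter's OpenAI-compatible endpoint (the model slug below is a hypothetical placeholder; check the OpenRouter listing for the actual identifier):

```python
from openai import OpenAI  # OpenRouter exposes an OpenAI-compatible API

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

# "qwen/qwen3.6-max-preview" is an assumed slug for illustration only.
response = client.chat.completions.create(
    model="qwen/qwen3.6-max-preview",
    messages=[{"role": "user", "content": "Summarize sparse MoE in two sentences."}],
)
print(response.choices[0].message.content)
```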

Pricing Structure

  • Input tokens: $1.30 per million tokens
  • Output tokens: $7.80 per million tokens

The 6:1 ratio of output to input pricing is typical for large language models, reflecting the higher compute cost of generating output tokens relative to processing input tokens.
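For a rough sense of cost at these rates, a back-of-the-envelope calculation (the token counts are arbitrary examples):

```python
# Published rates for Qwen3.6 Max Preview, in dollars per million tokens.
INPUT_RATE = 1.30
OUTPUT_RATE = 7.80

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at the listed per-million-token rates."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: a near-max-context prompt with a moderate completion.
print(f"${request_cost(200_000, 4_000):.4f}")  # $0.2912
```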

Reasoning Capabilities

The integrated thinking mode allows the model to show step-by-step reasoning processes. According to OpenRouter's documentation, developers can access reasoning traces through a reasoning_details array in API responses. The model can preserve these reasoning traces when passed back in subsequent conversation turns, enabling continuous reasoning across multi-turn interactions.
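A minimal sketch of round-tripping those traces over OpenRouter's OpenAI-compatible endpoint is shown below. The reasoning_details field name comes from OpenRouter's documentation as cited above; the model slug and the exact response shape are assumptions:

```python
import requests

URL = "https://openrouter.ai/api/v1/chat/completions"
HEADERS = {"Authorization": "Bearer YOUR_OPENROUTER_KEY"}
MODEL = "qwen/qwen3.6-max-preview"  # hypothetical slug for illustration

messages = [{"role": "user", "content": "Plan a database migration in three steps."}]
first = requests.post(URL, headers=HEADERS,
                      json={"model": MODEL, "messages": messages}).json()

assistant = first["choices"][0]["message"]
messages.append({
    "role": "assistant",
    "content": assistant["content"],
    # Passing reasoning_details back is what preserves the thinking trace,
    # letting the model continue its reasoning in the next turn.
    "reasoning_details": assistant.get("reasoning_details", []),
})
messages.append({"role": "user", "content": "Now add a rollback plan."})

second = requests.post(URL, headers=HEADERS,
                       json={"model": MODEL, "messages": messages}).json()
print(second["choices"][0]["message"]["content"])
```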

What This Means

Qwen3.6 Max Preview represents Alibaba's entry into the trillion-parameter model tier, joining other frontier models in supporting extended context windows beyond 200K tokens. The sparse MoE architecture suggests a focus on inference efficiency, though Alibaba has not disclosed the number of active parameters per forward pass. The proprietary, API-only release contrasts with Alibaba's previous pattern of releasing open-weight Qwen models, indicating a strategic shift toward commercial model offerings. The emphasis on agentic coding and tool use positions this model for enterprise workflows requiring autonomous task execution.

Related Articles

model release

Alibaba's Qwen Team Releases Qwen3.6 27B With 262K Context Window and Video Processing

Alibaba's Qwen Team has released Qwen3.6 27B, a 27-billion parameter multimodal language model with a 262,144-token context window. The model accepts text, image, and video inputs and includes a built-in thinking mode for extended reasoning, with pricing at $0.195 per million input tokens and $1.56 per million output tokens.

model release

DeepSeek Releases V4-Flash: 284B-Parameter MoE Model With 1M Token Context at 27% Inference Cost

DeepSeek released two Mixture-of-Experts models: V4-Flash with 284B total parameters (13B activated) and V4-Pro with 1.6T parameters (49B activated). Both models support one million token context windows and use a hybrid attention architecture that requires only 27% of the inference FLOPs compared to DeepSeek-V3.2 at 1M token context.

model release

DeepSeek V4 Flash Released: 284B Parameter MoE Model with 1M Context Window at $0.14 per Million Tokens

DeepSeek has released V4 Flash, a Mixture-of-Experts model with 284B total parameters and 13B activated parameters per request. The model supports a 1,048,576-token context window and is priced at $0.14 per million input tokens and $0.28 per million output tokens.

model release

DeepSeek Releases V4-Pro: 1.6T Parameter MoE Model with 1M Token Context

DeepSeek released two new Mixture-of-Experts models: DeepSeek-V4-Pro with 1.6 trillion parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated), both supporting one million token context length. The models achieve 27% of inference FLOPs and 10% of KV cache compared to DeepSeek-V3.2 at 1M context through a hybrid attention architecture combining Compressed Sparse Attention and Heavily Compressed Attention.
