Alibaba Qwen Releases Qwen3.6 Flash with 1M Context Window at $0.25 per 1M Input Tokens
Alibaba's Qwen team has released Qwen3.6 Flash, a multimodal language model that processes text, image, and video inputs with a 1 million token context window. Released on April 27, 2026, the model is positioned as a fast, efficient option in the Qwen 3.6 series.
Pricing and Technical Specifications
The model is priced at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens for prompts up to 256K tokens. According to the release information, tiered pricing applies for requests exceeding 256K tokens, though specific rates for higher tiers were not disclosed.
The 1M token context window places Qwen3.6 Flash among models with extended context capabilities, though it falls short of some competitors offering 2M+ token windows. The model supports prompt caching with separate pricing for cache read and cache creation operations.
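The published base-tier rates translate into per-request costs along these lines; the sketch below covers only prompts up to 256K tokens, since rates for the higher tiers were not disclosed:

```python
# Rough per-request cost estimator for Qwen3.6 Flash at the published
# base-tier rates (prompts up to 256K tokens). Rates above 256K are
# tiered and undisclosed, so this sketch stops at the base tier.

INPUT_RATE = 0.25 / 1_000_000   # USD per input token
OUTPUT_RATE = 1.50 / 1_000_000  # USD per output token
BASE_TIER_LIMIT = 256_000       # tokens; tiered pricing applies above this

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost, valid only for base-tier prompts."""
    if input_tokens > BASE_TIER_LIMIT:
        raise ValueError("prompt exceeds 256K tokens; tiered rates apply")
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 100K-token prompt with a 2K-token reply
print(f"${estimate_cost(100_000, 2_000):.4f}")  # → $0.0280
```

At these rates, even a prompt near the 256K base-tier ceiling costs only a few cents on the input side, which is what makes the long context economically interesting.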
Multimodal Capabilities
Qwen3.6 Flash handles three input modalities: text, images, and video. This positions it as a general-purpose multimodal model, though specific benchmark scores and performance metrics were not provided in the release announcement.
The model is available through OpenRouter, which routes requests across multiple providers with automatic fallback for uptime optimization. OpenRouter's implementation supports reasoning-enabled features, allowing the model to display step-by-step thinking processes through a reasoning_details array in API responses.
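Reading the model's thinking traces then amounts to walking that array in the response. The exact entry fields shown below (`type`, `text`) are assumptions for illustration; consult OpenRouter's documentation for the authoritative schema:

```python
# Sketch: pulling step-by-step reasoning out of an OpenRouter chat
# completion response via the reasoning_details array. The entry
# shape here is an assumption, not a confirmed schema.

sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "The answer is 42.",
            "reasoning_details": [
                {"type": "reasoning.text", "text": "First, restate the question."},
            ],
        }
    }]
}

def extract_reasoning(response: dict) -> list[str]:
    """Collect the text of each reasoning step, if the model emitted any."""
    message = response["choices"][0]["message"]
    return [d.get("text", "") for d in message.get("reasoning_details", [])]

print(extract_reasoning(sample_response))  # → ['First, restate the question.']
```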
API Integration
Developers can access Qwen3.6 Flash through OpenRouter's normalized API, which maintains compatibility with OpenAI SDK conventions. The platform provides request routing to optimize for prompt size and parameters, with provider fallbacks to maintain service availability.
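Because the API follows OpenAI SDK conventions, a multimodal request is an ordinary chat-completions payload with mixed content parts. The model slug `qwen/qwen3.6-flash` and the image URL below are assumptions; check OpenRouter's model listing for the real identifier:

```python
import json

# Sketch of a multimodal request body for OpenRouter's OpenAI-compatible
# chat completions endpoint. Model slug and image URL are placeholders.
payload = {
    "model": "qwen/qwen3.6-flash",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/cat.png"}},
        ],
    }],
}

# To send it, POST to https://openrouter.ai/api/v1/chat/completions with an
# "Authorization: Bearer <OPENROUTER_API_KEY>" header, or point the OpenAI
# SDK at base_url="https://openrouter.ai/api/v1" with the same payload.
print(json.dumps(payload, indent=2))
```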
What This Means
Qwen3.6 Flash represents Alibaba's continued push into competitive AI model pricing while expanding multimodal capabilities. The $0.25 rate per 1M input tokens undercuts several major competitors, though direct performance comparisons are not possible without published benchmark scores. The tiered pricing above 256K tokens suggests the model is optimized for shorter interactions, with that threshold marking the point where costs step up. Video input support is notable, as the capability remains relatively uncommon among broadly available language models.
Related Articles
Alibaba Qwen Releases 35B Sparse MoE Model with 262K Context and Multimodal Support
Alibaba Cloud has released Qwen3.6-35B-A3B, an open-weight sparse mixture-of-experts model with 35 billion total parameters but only 3 billion active parameters per token. The model features a 262K native context window (expandable to 1M tokens), multimodal input support, and integrated reasoning mode with preserved thinking traces.
Alibaba's Qwen Team Releases Qwen3.6 27B With 262K Context Window and Video Processing
Alibaba's Qwen Team has released Qwen3.6 27B, a 27-billion parameter multimodal language model with a 262,144-token context window. The model accepts text, image, and video inputs and includes a built-in thinking mode for extended reasoning, with pricing at $0.195 per million input tokens and $1.56 per million output tokens.
DeepSeek V4 Pro launches with 1.6 trillion parameters, 1M token context at $0.145 per million input tokens
Chinese AI lab DeepSeek has released preview versions of DeepSeek V4 Flash and V4 Pro, mixture-of-experts models with 1 million token context windows. The V4 Pro has 1.6 trillion total parameters (49 billion active), making it the largest open-weight model available, while both models significantly undercut frontier model pricing.
Alibaba Releases Qwen3.6 Max Preview: 1 Trillion Parameter MoE Model With 262K Context Window
Alibaba Cloud has released Qwen3.6 Max Preview, a proprietary frontier model built on sparse mixture-of-experts architecture with approximately 1 trillion total parameters. The model supports a 262,144-token context window and features integrated thinking mode for multi-turn reasoning, priced at $1.30 per million input tokens and $7.80 per million output tokens.