Alibaba Qwen Releases Qwen3.6 Flash with 1M Context Window at $0.25 per 1M Input Tokens
Alibaba's Qwen team has released Qwen3.6 Flash, a multimodal language model that processes text, image, and video inputs with a 1 million token context window. Released on April 27, 2026, the model is positioned as a fast, efficient option in the Qwen 3.6 series.
Pricing and Technical Specifications
The model is priced at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens for prompts up to 256K tokens. According to the release information, tiered pricing applies for requests exceeding 256K tokens, though specific rates for higher tiers were not disclosed.
The 1M token context window places Qwen3.6 Flash among models with extended context capabilities, though it falls short of some competitors offering 2M+ token windows. The model supports prompt caching with separate pricing for cache read and cache creation operations.
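The published base-tier rates translate into per-request costs along these lines; the sketch below covers only prompts up to 256K tokens, since rates for the higher tiers were not disclosed:

```python
# Rough per-request cost estimator for Qwen3.6 Flash at the published
# base-tier rates (prompts up to 256K tokens). Rates above 256K are
# tiered and undisclosed, so this sketch stops at the base tier.

INPUT_RATE = 0.25 / 1_000_000   # USD per input token
OUTPUT_RATE = 1.50 / 1_000_000  # USD per output token
BASE_TIER_LIMIT = 256_000       # tokens; tiered pricing applies above this

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost, valid only for base-tier prompts."""
    if input_tokens > BASE_TIER_LIMIT:
        raise ValueError("prompt exceeds 256K tokens; tiered rates apply")
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 100K-token prompt with a 2K-token reply
print(f"${estimate_cost(100_000, 2_000):.4f}")  # → $0.0280
```

At these rates, even a prompt near the 256K base-tier ceiling costs only a few cents on the input side, which is what makes the long context economically interesting.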
Multimodal Capabilities
Qwen3.6 Flash handles three input modalities: text, images, and video. This positions it as a general-purpose multimodal model, though specific benchmark scores and performance metrics were not provided in the release announcement.
The model is available through OpenRouter, which routes requests across multiple providers with automatic fallback for uptime optimization. OpenRouter's implementation supports reasoning-enabled features, allowing the model to display step-by-step thinking processes through a reasoning_details array in API responses.
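Reading the model's thinking traces then amounts to walking that array in the response. The exact entry fields shown below (`type`, `text`) are assumptions for illustration; consult OpenRouter's documentation for the authoritative schema:

```python
# Sketch: pulling step-by-step reasoning out of an OpenRouter chat
# completion response via the reasoning_details array. The entry
# shape here is an assumption, not a confirmed schema.

sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "The answer is 42.",
            "reasoning_details": [
                {"type": "reasoning.text", "text": "First, restate the question."},
            ],
        }
    }]
}

def extract_reasoning(response: dict) -> list[str]:
    """Collect the text of each reasoning step, if the model emitted any."""
    message = response["choices"][0]["message"]
    return [d.get("text", "") for d in message.get("reasoning_details", [])]

print(extract_reasoning(sample_response))  # → ['First, restate the question.']
```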
API Integration
Developers can access Qwen3.6 Flash through OpenRouter's normalized API, which maintains compatibility with OpenAI SDK conventions. The platform provides request routing to optimize for prompt size and parameters, with provider fallbacks to maintain service availability.
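Because the API follows OpenAI SDK conventions, a multimodal request is an ordinary chat-completions payload with mixed content parts. The model slug `qwen/qwen3.6-flash` and the image URL below are assumptions; check OpenRouter's model listing for the real identifier:

```python
import json

# Sketch of a multimodal request body for OpenRouter's OpenAI-compatible
# chat completions endpoint. Model slug and image URL are placeholders.
payload = {
    "model": "qwen/qwen3.6-flash",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/cat.png"}},
        ],
    }],
}

# To send it, POST to https://openrouter.ai/api/v1/chat/completions with an
# "Authorization: Bearer <OPENROUTER_API_KEY>" header, or point the OpenAI
# SDK at base_url="https://openrouter.ai/api/v1" with the same payload.
print(json.dumps(payload, indent=2))
```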
What This Means
Qwen3.6 Flash represents Alibaba's continued push into competitive AI model pricing while expanding multimodal capabilities. The $0.25 rate per 1M input tokens undercuts several major competitors, though direct performance comparisons are not possible without published benchmark scores. The tiered pricing above 256K tokens suggests the model is optimized for shorter interactions, with that threshold marking the point where costs step up. Video input support is notable, as the capability remains relatively uncommon among broadly available language models.
Related Articles
Alibaba Qwen Releases 35B Sparse MoE Model with 262K Context and Multimodal Support
Alibaba Cloud has released Qwen3.6-35B-A3B, an open-weight sparse mixture-of-experts model with 35 billion total parameters but only 3 billion active parameters per token. The model features a 262K native context window (expandable to 1M tokens), multimodal input support, and integrated reasoning mode with preserved thinking traces.
Alibaba's Qwen Team Releases Qwen3.6 27B With 262K Context Window and Video Processing
Alibaba's Qwen Team has released Qwen3.6 27B, a 27-billion parameter multimodal language model with a 262,144-token context window. The model accepts text, image, and video inputs and includes a built-in thinking mode for extended reasoning, with pricing at $0.195 per million input tokens and $1.56 per million output tokens.
DeepSeek V4 Pro launches with 1.6 trillion parameters, 1M token context at $0.145 per million input tokens
Chinese AI lab DeepSeek has released preview versions of DeepSeek V4 Flash and V4 Pro, mixture-of-experts models with 1 million token context windows. The V4 Pro has 1.6 trillion total parameters (49 billion active), making it the largest open-weight model available, while both models significantly undercut frontier model pricing.
Alibaba Releases Qwen3.6 Max Preview: 1 Trillion Parameter MoE Model With 262K Context Window
Alibaba Cloud has released Qwen3.6 Max Preview, a proprietary frontier model built on sparse mixture-of-experts architecture with approximately 1 trillion total parameters. The model supports a 262,144-token context window and features integrated thinking mode for multi-turn reasoning, priced at $1.30 per million input tokens and $7.80 per million output tokens.