Alibaba releases Qwen 3.6 Plus with 1M context window, free tier now available
Alibaba's Qwen division released Qwen 3.6 Plus on April 2, 2026, offering free access to a model with a 1,000,000 token context window. The model combines linear attention with sparse mixture-of-experts routing and achieves a 78.8 score on SWE-bench Verified for software engineering tasks.
Alibaba Releases Qwen 3.6 Plus with 1M Context Window at No Cost
Alibaba's Qwen division released Qwen 3.6 Plus on April 2, 2026, as a free-tier model available through OpenRouter and other providers. The model features a 1,000,000 token context window with $0 pricing for both input and output tokens.
Architecture and Performance
Qwen 3.6 Plus uses a hybrid architecture combining efficient linear attention with sparse mixture-of-experts (SMoE) routing. According to Alibaba, this design enables scalable inference while maintaining high performance across diverse tasks.
The model shows substantial improvements over the 3.5 series, particularly in:
- Agentic coding: Enhanced ability to write and understand agent-based code systems
- Front-end development: Improved handling of web application logic and UI patterns
- Reasoning: General reasoning capability gains across benchmark tasks
- "Vibe coding": Described as an improved experience for intuitive coding tasks
On SWE-bench Verified—a software engineering benchmark measuring repository-level problem solving—Qwen 3.6 Plus scores 78.8, placing it competitive with leading models.
Capabilities and Task Performance
Alibaba claims the model excels at complex multimodal and reasoning-heavy tasks including:
- 3D scene understanding and manipulation
- Game development and logic
- Repository-level code understanding and refactoring
- Pure-text and multimodal tasks at "state-of-the-art" levels
OpenRouter usage data shows the model handling approximately 1.77M prompt tokens daily, with 50K reasoning tokens and 6K completion tokens, indicating active adoption for both standard and reasoning-enabled inference.
Availability and Integration
The free tier removes cost barriers for developers evaluating the model. OpenRouter provides normalized API access across multiple providers, with automatic fallback to maximize uptime. The platform supports reasoning-enabled inference, allowing users to access the model's internal step-by-step thinking process through the reasoning_details parameter.
Developers can enable reasoning mode to observe intermediate reasoning steps before the final answer, with the ability to continue conversations while preserving complete reasoning chains.
What This Means
Alibaba's release of Qwen 3.6 Plus at no cost increases competition in the free-tier LLM space, directly challenging OpenAI's free ChatGPT tier and Meta's Llama availability. The 1M context window and strong SWE-bench Verified score (78.8) position it as a serious option for code-generation tasks and long-context applications. However, the absence of published benchmark scores across standard evals (MMLU, HumanEval, etc.) limits direct comparison to other leading models. The free pricing suggests Alibaba is prioritizing adoption and usage data gathering over near-term monetization.
Related Articles
Alibaba releases Qwen3.6-Plus with 1M token context, claims performance near Claude 4.5 Opus
Alibaba has released Qwen3.6-Plus, its third proprietary AI model in days, featuring a 1 million token context window available via Alibaba Cloud Model Studio API. The model claims improved agentic coding capabilities and partially outperforms Anthropic's Claude 4.5 Opus in Alibaba-conducted benchmarks, though trails Claude 4.6 Opus released in December 2025.
Alibaba's Qwen3.5-Omni learns to write code from speech and video without explicit training
Alibaba has released Qwen3.5-Omni, an omnimodal model handling text, images, audio, and video with a 256,000-token context window. The model reportedly outperforms Google's Gemini 3.1 Pro on audio tasks with support for 74 languages in speech recognition, a 6x increase from its predecessor. An unexpected emergent capability: writing working code from spoken instructions and video input, which the team did not explicitly train.
Alibaba releases Qwen 3.6 Plus Preview with 1M token context, free via OpenRouter
Alibaba's Qwen division has released Qwen 3.6 Plus Preview, a free multimodal model available via OpenRouter with a 1,000,000 token context window. The model claims stronger reasoning and more reliable agentic behavior compared to the 3.5 series, with particular strength in coding and complex problem-solving tasks.
NVIDIA releases gpt-oss-puzzle-88B, 88B-parameter reasoning model with 1.63× throughput gains
NVIDIA released gpt-oss-puzzle-88B on March 26, 2026, a 88-billion parameter mixture-of-experts model optimized for inference efficiency on H100 hardware. Built using the Puzzle post-training neural architecture search framework, the model achieves 1.63× throughput improvement in long-context (64K/64K) scenarios and up to 2.82× improvement on single H100 GPUs compared to its parent gpt-oss-120B, while matching or exceeding accuracy across reasoning effort levels.
Comments
Loading...