model release

Alibaba releases Qwen 3.6 Plus with 1M context window, free tier now available

TL;DR

Alibaba's Qwen division released Qwen 3.6 Plus on April 2, 2026, offering free access to a model with a 1,000,000 token context window. The model combines linear attention with sparse mixture-of-experts routing and achieves a 78.8 score on SWE-bench Verified for software engineering tasks.

2 min read
0

Alibaba Releases Qwen 3.6 Plus with 1M Context Window at No Cost

Alibaba's Qwen division released Qwen 3.6 Plus on April 2, 2026, as a free-tier model available through OpenRouter and other providers. The model features a 1,000,000 token context window with $0 pricing for both input and output tokens.

Architecture and Performance

Qwen 3.6 Plus uses a hybrid architecture combining efficient linear attention with sparse mixture-of-experts (SMoE) routing. According to Alibaba, this design enables scalable inference while maintaining high performance across diverse tasks.

The model shows substantial improvements over the 3.5 series, particularly in:

  • Agentic coding: Enhanced ability to write and understand agent-based code systems
  • Front-end development: Improved handling of web application logic and UI patterns
  • Reasoning: General reasoning capability gains across benchmark tasks
  • "Vibe coding": Described as an improved experience for intuitive coding tasks

On SWE-bench Verified—a software engineering benchmark measuring repository-level problem solving—Qwen 3.6 Plus scores 78.8, placing it competitive with leading models.

Capabilities and Task Performance

Alibaba claims the model excels at complex multimodal and reasoning-heavy tasks including:

  • 3D scene understanding and manipulation
  • Game development and logic
  • Repository-level code understanding and refactoring
  • Pure-text and multimodal tasks at "state-of-the-art" levels

OpenRouter usage data shows the model handling approximately 1.77M prompt tokens daily, with 50K reasoning tokens and 6K completion tokens, indicating active adoption for both standard and reasoning-enabled inference.

Availability and Integration

The free tier removes cost barriers for developers evaluating the model. OpenRouter provides normalized API access across multiple providers, with automatic fallback to maximize uptime. The platform supports reasoning-enabled inference, allowing users to access the model's internal step-by-step thinking process through the reasoning_details parameter.

Developers can enable reasoning mode to observe intermediate reasoning steps before the final answer, with the ability to continue conversations while preserving complete reasoning chains.

What This Means

Alibaba's release of Qwen 3.6 Plus at no cost increases competition in the free-tier LLM space, directly challenging OpenAI's free ChatGPT tier and Meta's Llama availability. The 1M context window and strong SWE-bench Verified score (78.8) position it as a serious option for code-generation tasks and long-context applications. However, the absence of published benchmark scores across standard evals (MMLU, HumanEval, etc.) limits direct comparison to other leading models. The free pricing suggests Alibaba is prioritizing adoption and usage data gathering over near-term monetization.

Related Articles

model release

DeepReinforce Releases Ornith-1.0, Open-Source Agentic Coding Model in 9B to 397B Sizes

DeepReinforce has released Ornith-1.0, an MIT-licensed model designed for agentic coding tasks with variants ranging from 9B to 397B parameters. Built on top of Apache 2.0-licensed Gemma 4 and Qwen 3.5 base models, the company claims it achieves state-of-the-art performance among open-source models of comparable size on coding benchmarks.

model release

Mistral releases Leanstral 1.5: 119B parameter open-source model for Lean 4 proof assistance

Mistral AI has released Leanstral 1.5, an open-source 119B parameter mixture-of-experts model designed specifically for Lean 4 proof assistance. The model features 128 experts with 4 active per token (6.5B activated parameters), a 256k token context window, and multimodal input capabilities.

model release

Google releases Gemini 3.1 Flash Lite Image, its fastest and cheapest image generation model

Google has released Gemini 3.1 Flash Lite Image, also called Nano Banana 2 Lite, which the company describes as its fastest and cheapest image generation model. The model is available through Google's AI Studio and Gemini API with the identifier gemini-3.1-flash-lite-image.

model release

Claude Sonnet 5 ships with 1M token context and new tokenizer that increases costs 30-40% for English text

Anthropic released Claude Sonnet 5 with a 1 million token context window and 128,000 token maximum output. The model removes traditional sampling parameters and introduces a new tokenizer that generates approximately 30% more tokens than Sonnet 4.6 for the same English text—effectively a significant price increase despite unchanged nominal rates of $3/million input and $15/million output tokens.

Comments

Loading...