model release

Alibaba Qwen Releases 27B Parameter Model That Claims to Match 397B Performance on Coding Tasks

TL;DR

Alibaba Qwen released Qwen3.6-27B, a 27B parameter dense model that claims flagship-level coding performance surpassing their previous 397B MoE model across major coding benchmarks. The full model is 55.6GB compared to 807GB for the predecessor.

April 22, 2026 · 5:06 PM1 min read

Qwen3.6 27B — Quick Specs

Context window262K tokens

Input$0.195/1M tokens

Output$1.56/1M tokens

Compare Qwen3.6 27B with other models →

Alibaba Qwen Releases 27B Parameter Model That Claims to Match 397B Performance on Coding Tasks

Alibaba Qwen released Qwen3.6-27B, a 27B parameter dense model that claims to deliver "flagship-level agentic coding performance" surpassing their previous-generation Qwen3.5-397B-A17B (a 397B total parameter, 17B active MoE model) across all major coding benchmarks, according to the company.

The size difference is significant: Qwen3.6-27B is 55.6GB on Hugging Face compared to 807GB for Qwen3.5-397B-A17B. A quantized Q4_K_M version from Unsloth reduces the footprint to 16.8GB.

Performance Testing

Simon Willison tested the 16.8GB quantized version using llama.cpp's llama-server. In a test generating an SVG of a pelican riding a bicycle, the model produced a detailed, coherent image with correct bicycle geometry (spokes, chain, frame), a recognizable pelican, and background details including clouds, birds, and grass.

Performance metrics from llama-server:

Prompt processing: 54.32 tokens/s (20 tokens in 0.4s)
Generation speed: 25.57 tokens/s (4,444 tokens in 2min 53s)

Model Availability

Qwen3.6-27B is available as open weights on Hugging Face. The model includes reasoning mode support via the --reasoning on flag and uses a 65,536 token context window in testing configurations.

Specific benchmark scores and pricing were not disclosed in the announcement. The model represents Qwen's approach to achieving competitive coding performance in a significantly smaller architecture compared to their MoE models.

What This Means

If the coding benchmark claims hold up to independent verification, Qwen3.6-27B represents a substantial efficiency gain—achieving similar performance to a 397B parameter model in a 27B dense architecture. The 16.8GB quantized version running locally at 25 tokens/s makes flagship-level coding capabilities accessible on consumer hardware. However, the specific benchmarks and scores referenced in Qwen's "all major coding benchmarks" claim require independent validation.

Source: simonwillison.net ↗

qwen alibaba-qwen open-weights coding local-models quantization llama-cpp

model releaseJuly 21, 2026

Alibaba Releases Qwen-Image-3.0, an Image Generator That Renders 10-Pixel Text and 3x3 Infographic Grids in One Pass

Alibaba's Qwen team has released Qwen-Image-3.0, an image generator that accepts prompts up to 4,500 tokens and can render legible text as small as ten pixels, complex LaTeX formulas, and twelve languages in a single pass. The model is currently invite-only via API, and unlike its predecessor, it likely won't ship with open weights.

model releaseJuly 20, 2026

Alibaba releases Qwen 3.8, a 2.4 trillion parameter open-weight model claiming second place behind Fable 5

Alibaba has released Qwen 3.8, a 2.4 trillion parameter open-weight model that the company claims trails only Fable 5. The multimodal model processes images, videos, and documents, with a preview available through Alibaba's platforms at 10 percent of standard pricing.

model releaseJuly 20, 2026

Moonshot AI Releases Kimi K3: 2.8T Parameter Open Model at $3/$15 Per Million Tokens

Moonshot AI has released Kimi K3, a 2.8 trillion parameter model with 1 million token context window and native multimodal input. The model ranks #1 in Frontend Code Arena and #9 in Text Arena, with pricing at $3 per million input tokens and $15 per million output tokens—comparable to Claude Sonnet 5 pricing while delivering performance the company claims is near Claude Opus 4.8 and GPT-5.5.

model releaseJuly 20, 2026

Thinking Machines releases Inkling: 975B-parameter MoE model with Apache 2.0 license, first major US open-weight multimo

Thinking Machines Lab released Inkling, a mixture-of-experts model with 975B total parameters and 41B active parameters, trained on 45 trillion tokens across text, images, audio, and video. The Apache 2.0-licensed model supports up to 1M context and debuts alongside Inkling-Small (276B-A12B), marking what observers call the strongest US-based open-weight release to date.

Alibaba Qwen Releases 27B Parameter Model That Claims to Match 397B Performance on Coding Tasks

Qwen3.6 27B — Quick Specs

Alibaba Qwen Releases 27B Parameter Model That Claims to Match 397B Performance on Coding Tasks

Performance Testing

Model Availability

What This Means

Related Articles

Alibaba Releases Qwen-Image-3.0, an Image Generator That Renders 10-Pixel Text and 3x3 Infographic Grids in One Pass

Alibaba releases Qwen 3.8, a 2.4 trillion parameter open-weight model claiming second place behind Fable 5

Moonshot AI Releases Kimi K3: 2.8T Parameter Open Model at $3/$15 Per Million Tokens

Thinking Machines releases Inkling: 975B-parameter MoE model with Apache 2.0 license, first major US open-weight multimo

Comments