Alibaba Qwen Releases 27B Parameter Model That Claims to Match 397B Performance on Coding Tasks
Alibaba Qwen released Qwen3.6-27B, a 27B-parameter dense model that it claims delivers flagship-level coding performance, surpassing its previous 397B MoE model across major coding benchmarks. The full model is 55.6GB, compared to 807GB for its predecessor.
Alibaba Qwen released Qwen3.6-27B, a 27B-parameter dense model that, according to the company, delivers "flagship-level agentic coding performance" and surpasses the previous-generation Qwen3.5-397B-A17B (a 397B-total-parameter MoE model with 17B active parameters) across all major coding benchmarks.
The size difference is significant: Qwen3.6-27B is 55.6GB on Hugging Face compared to 807GB for Qwen3.5-397B-A17B. A quantized Q4_K_M version from Unsloth reduces the footprint to 16.8GB.
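Those footprints line up with simple back-of-the-envelope math. As a sanity check (assuming bf16 weights for the full model and roughly 4.85 bits per weight for Q4_K_M, a commonly cited average rather than a figure from the announcement):

```python
# Back-of-the-envelope size check for the reported model footprints.
# Assumes bf16 (16 bits/weight) for the full model and ~4.85 bits/weight
# for Q4_K_M -- an assumed average, not a figure from the announcement.
params = 27e9

full_gb = params * 16 / 8 / 1e9   # bf16: 2 bytes per weight
q4_gb = params * 4.85 / 8 / 1e9   # assumed Q4_K_M average

print(f"full precision: ~{full_gb:.0f} GB")  # ~54 GB vs. 55.6GB reported
print(f"Q4_K_M:         ~{q4_gb:.1f} GB")    # ~16.4 GB vs. 16.8GB reported
```

The small surplus in the published sizes is plausibly embeddings, tokenizer data, and file metadata on top of the raw weight tensors.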
Performance Testing
Simon Willison tested the 16.8GB quantized version using llama.cpp's llama-server. In a test generating an SVG of a pelican riding a bicycle, the model produced a detailed, coherent image with correct bicycle geometry (spokes, chain, frame), a recognizable pelican, and background details including clouds, birds, and grass.
Performance metrics from llama-server:
- Prompt processing: 54.32 tokens/s (20 tokens in 0.4s)
- Generation speed: 25.57 tokens/s (4,444 tokens in 2min 53s)
Model Availability
Qwen3.6-27B is available as open weights on Hugging Face. The model supports a reasoning mode, enabled via the --reasoning on flag, and was run with a 65,536-token context window in the testing configuration.
Specific benchmark scores and pricing were not disclosed in the announcement. The release reflects Qwen's push to deliver competitive coding performance from a dense architecture far smaller than its MoE flagships.
What This Means
If the coding benchmark claims hold up to independent verification, Qwen3.6-27B represents a substantial efficiency gain: performance comparable to a 397B-parameter MoE model from a 27B dense architecture. The 16.8GB quantized version running locally at roughly 25 tokens/s would put flagship-level coding capability within reach of consumer hardware. For now, though, Qwen has not published the specific benchmarks or scores behind its "all major coding benchmarks" claim.
Related Articles
Alibaba Qwen Releases 27B Parameter Model with 262K Context Window, Claims 77.2% on SWE-bench Verified
Alibaba Qwen released Qwen3.6-27B, a 27-billion parameter model with a 262,144 token context window extensible to 1,010,000 tokens. The model claims 77.2% on SWE-bench Verified and 53.5% on SWE-bench Pro, with open weights available on Hugging Face.
Gemma 4 VLA runs locally on NVIDIA Jetson Orin Nano Super with 8GB RAM, autonomous webcam tool-calling
NVIDIA engineer Asier Arranz demonstrated Gemma 4 running as a vision-language agent (VLA) on a Jetson Orin Nano Super with 8GB RAM. The model autonomously decides when to access a webcam based on user queries, with no hardcoded triggers—performing speech-to-text, vision analysis, and text-to-speech entirely locally.
Alibaba Qwen Releases 35B Parameter Qwen3.6-35B-A3B Model with 262K Native Context Window
Alibaba Qwen has released Qwen3.6-35B-A3B, a 35-billion-parameter mixture-of-experts model with 3 billion activated parameters and a 262,144-token native context window extendable to 1,010,000 tokens. The model scores 73.4 on SWE-bench Verified, and an FP8-quantized variant posts performance metrics nearly identical to the original model.
Anthropic releases Claude Opus 4.7 with improved coding and vision, confirms it trails unreleased Mythos model
Anthropic released Claude Opus 4.7 with improved coding capabilities, higher-resolution vision, and a new reasoning level. The company publicly acknowledged the model underperforms its unreleased Mythos system, which remains restricted due to safety concerns.