model release

Z.ai releases GLM-5.1, 754B parameter open-weight model with improved code generation

TL;DR

Z.ai has released GLM-5.1, a 754-billion parameter open-weight model matching the size of its predecessor GLM-5. The model demonstrates improved ability to generate complex, multi-part outputs like HTML pages with SVG graphics and CSS animations, available via Hugging Face and OpenRouter.

April 7, 2026 · 9:35 PM2 min read

Z.ai has released GLM-5.1, a 754-billion parameter open-weight model designed for long-horizon tasks. The model is available under an MIT license on Hugging Face (1.51TB) and via OpenRouter.

Model Specifications

GLM-5.1 maintains the same 754B parameter count as the original GLM-5 release, sharing the underlying architecture described in the same research paper. Despite identical scale, the updated version claims improved performance on complex multi-step generation tasks.

Demonstrated Capabilities

Early testing reveals the model's strength in code generation and creative synthesis. When prompted to generate SVG graphics, GLM-5.1 unprompted produced complete HTML pages that bundled SVG artwork with accompanying CSS animations—a behavior suggesting the model has learned to anticipate contextual requirements beyond explicit instructions.

In one test, the model generated an animated pelican on a bicycle with anatomical accuracy and added CSS animations for wheel rotation, pedal movement, and cloud parallax. When the animation initially broke (CSS transforms overriding SVG positioning), the model correctly diagnosed the issue in a follow-up prompt: "CSS transform animations on SVG elements override the SVG transform attribute used for positioning, causing the pelican to lose its placement."

The model then auto-corrected the code by switching to <animateTransform> for SVG rotations and separating positioning from animation logic. The corrected output included nuanced details like a wobbling pelican beak animated with precise SVG scaling transforms.

Open-Weight Availability

Unlike proprietary competitors, GLM-5.1's full weights are openly available, making it accessible for local deployment and fine-tuning. The 1.51TB model size places it in the realm of infrastructure-heavy inference, though quantized versions have not been announced.

Access via OpenRouter allows API-based usage without local compute requirements. No pricing information has been disclosed for either platform.

Context and Positioning

GLM-5.1 arrives amid competitive pressure from larger models (including GPT-4, Claude 3.5, and others in the 100B+ parameter range). Z.ai's emphasis on "long-horizon tasks" suggests the model targets multi-step reasoning and sustained coherence over extended generations—domains where larger models often excel but many open-weight alternatives struggle.

The model's demonstrated ability to self-correct based on user feedback and generate contextually appropriate wrapper code (HTML + SVG + CSS in one output) hints at improvements in planning and output formatting beyond what parameter count alone would predict.

What This Means

GLM-5.1 represents a credible open-weight alternative for developers needing strong code generation and multi-modal creative outputs without relying on commercial APIs. The model's ability to generate complete, working applications (not just code snippets) suggests promise for automation of frontend development and technical documentation. However, the 1.51TB inference footprint limits practical deployment to well-resourced organizations. The real competitive advantage lies in open-weight availability—competitors' equivalent models remain proprietary, making GLM-5.1 valuable for research, customization, and cost-sensitive production deployments.

Source: simonwillison.net ↗

glm-5.1 open-weights 754b-parameters code-generation z-ai long-horizon-tasks model-release mit-license

model releaseJune 30, 2026

Google releases Gemini 3.1 Flash Lite Image, its fastest and cheapest image generation model

Google has released Gemini 3.1 Flash Lite Image, also called Nano Banana 2 Lite, which the company describes as its fastest and cheapest image generation model. The model is available through Google's AI Studio and Gemini API with the identifier gemini-3.1-flash-lite-image.

model releaseJune 30, 2026

Claude Sonnet 5 ships with 1M token context and new tokenizer that increases costs 30-40% for English text

Anthropic released Claude Sonnet 5 with a 1 million token context window and 128,000 token maximum output. The model removes traditional sampling parameters and introduces a new tokenizer that generates approximately 30% more tokens than Sonnet 4.6 for the same English text—effectively a significant price increase despite unchanged nominal rates of $3/million input and $15/million output tokens.

model releaseJune 30, 2026

Claude Sonnet 5 launches on AWS Bedrock with Opus-level intelligence at Sonnet pricing

Anthropic has released Claude Sonnet 5 on Amazon Bedrock and Claude Platform on AWS. The model delivers what Anthropic describes as near-Opus intelligence while maintaining Sonnet-tier pricing, with promotional rates available through August 31, 2026.

model releaseJune 30, 2026

Anthropic releases Claude Sonnet 5 at $2/1M input tokens, 63.2% agentic coding benchmark

Anthropic has released Claude Sonnet 5, its new mid-tier model optimized for agentic tasks, priced at $2 per million input tokens through August 31 before rising to $3/1M. The model scores 63.2% on agentic coding benchmarks, approaching Opus 4.8's 69.2% performance at a significantly lower price point.

Z.ai releases GLM-5.1, 754B parameter open-weight model with improved code generation

Model Specifications

Demonstrated Capabilities

Open-Weight Availability

Context and Positioning

What This Means

Related Articles

Google releases Gemini 3.1 Flash Lite Image, its fastest and cheapest image generation model

Claude Sonnet 5 ships with 1M token context and new tokenizer that increases costs 30-40% for English text

Claude Sonnet 5 launches on AWS Bedrock with Opus-level intelligence at Sonnet pricing

Anthropic releases Claude Sonnet 5 at $2/1M input tokens, 63.2% agentic coding benchmark

Comments