model release

Z.ai Releases GLM-5.2 with 1M Token Context Window at $1.40/$4.40 per Million

TL;DR

Z.ai has released GLM-5.2, a model designed for long-horizon engineering tasks with a 1 million token context window. The model is priced at $1.40 per million input tokens and $4.40 per million output tokens, and was released on June 16, 2025.

June 16, 2026 · 6:05 PM2 min read

GLM-5.2 — Quick Specs

Context window1000K tokens

Input$0.826/1M tokens

Output$2.596/1M tokens

Compare GLM-5.2 with other models →

Z.ai Releases GLM-5.2 with 1M Token Context Window

Z.ai has released GLM-5.2, a model with a 1 million token context window designed for project-level engineering tasks. The model is priced at $1.40 per million input tokens and $4.40 per million output tokens.

Technical Specifications

Context window: 1 million tokens
Input pricing: $1.40 per 1M tokens
Output pricing: $4.40 per 1M tokens
Release date: June 16, 2025
Modalities: Text in/out

Claimed Capabilities

According to Z.ai, GLM-5.2 is positioned as their "flagship model for the era of long-horizon tasks." The company claims the model can:

Handle project-level engineering context
Execute long-running tasks with improved reliability
Follow engineering standards consistently
Complete full development workflows from requirements to multi-platform deployment

The model is currently hosted exclusively through OpenRouter, with all requests forwarded directly to Z.ai's infrastructure.

Pricing Context

At $1.40/$4.40 per million tokens, GLM-5.2's pricing positions it in the mid-range for large context window models. For comparison:

Anthropic's Claude 3.5 Sonnet (200K context): $3/$15 per 1M tokens
OpenAI's GPT-4o (128K context): $2.50/$10 per 1M tokens
Google's Gemini 1.5 Pro (2M context): $1.25/$5 per 1M tokens

OpenRouter notes that effective pricing can be 60-80% lower than list prices when prompt caching is applied for repeated context.

What This Means

GLM-5.2 enters a competitive market for long-context models, where context window size alone no longer differentiates offerings. The 1M token window matches several existing models, while models like Gemini 1.5 Pro already offer 2M tokens at comparable pricing. The real test will be whether Z.ai's claimed advantages in long-horizon task execution and engineering workflow completion translate to measurable performance improvements in production use cases. Without published benchmark scores or independent verification of the model's capabilities, its market position remains uncertain.

Source: openrouter.ai ↗

Z.ai GLM-5.2 model-release long-context 1M-context-window engineering-ai

model releaseJuly 31, 2026

Google DeepMind Launches Gemini Robotics 2, a Single VLA Model for Arms to Humanoids

Google DeepMind has introduced Gemini Robotics 2, a vision-language-action model it calls its most advanced yet, designed to control everything from tabletop robot arms to full-body humanoids. The company also released Gemini Robotics ER 2, an embodied reasoning model that replaces ER 1.6.

model releaseJuly 31, 2026

Thinking Machines Releases Inkling Small, a 12B-Active-Parameter Model That Beats Its Larger Predecessor on Key Benchmar

Thinking Machines has released Inkling Small, an open-weights reasoning model with 276 billion total parameters but only 12 billion active. According to Artificial Analysis, it scores nearly as high as the company's larger Inkling model while using roughly a third of the parameters and far fewer output tokens per task.

model releaseJuly 31, 2026

DeepSeek Releases V4-Flash-0731, a 284B-Parameter Model That Beats Its Own Larger Pro Variant on Agentic Benchmarks

DeepSeek has shipped the full release of DeepSeek-V4-Flash-0731, a 284B-parameter model that according to DeepSeek outperforms its own larger V4-Pro (Preview) on agentic and coding benchmarks. Unsloth has published quantized GGUF versions, with lossless 8-bit weights requiring 162GB of storage.

model releaseJuly 31, 2026

Thinking Machines Lab Releases Inkling Small: 276B MoE Model with 524K Context Window

Thinking Machines Lab has released Inkling Small, an open-weight multimodal mixture-of-experts model with 12B active parameters out of 276B total and a 524K token context window. The model targets reasoning, coding, agentic workflows, and multilingual use cases at $0.58 per 1M input tokens and $1.44 per 1M output tokens.

Z.ai Releases GLM-5.2 with 1M Token Context Window at $1.40/$4.40 per Million

GLM-5.2 — Quick Specs

Z.ai Releases GLM-5.2 with 1M Token Context Window

Technical Specifications

Claimed Capabilities

Pricing Context

What This Means

Related Articles

Google DeepMind Launches Gemini Robotics 2, a Single VLA Model for Arms to Humanoids

Thinking Machines Releases Inkling Small, a 12B-Active-Parameter Model That Beats Its Larger Predecessor on Key Benchmar

DeepSeek Releases V4-Flash-0731, a 284B-Parameter Model That Beats Its Own Larger Pro Variant on Agentic Benchmarks

Thinking Machines Lab Releases Inkling Small: 276B MoE Model with 524K Context Window

Comments