model releaseZhipu AI

Zhipu AI releases GLM-5.2 with 1M token context and 62.1% SWE-bench Pro score

TL;DR

Zhipu AI released GLM-5.2, a 753 billion parameter model with a 1 million token context window. The model scores 62.1% on SWE-bench Pro and introduces IndexShare architecture that reduces per-token FLOPs by 2.9× at 1M context length. Released under MIT license with no regional restrictions.

2 min read
0

Zhipu AI releases GLM-5.2 with 1M token context and 62.1% SWE-bench Pro score

Zhipu AI released GLM-5.2, a 753 billion parameter model with a 1 million token context window and enhanced coding capabilities. The model scores 62.1% on SWE-bench Pro, positioning it between GPT-5.5 (58.6%) and Claude Opus 4.8 (69.2%) according to the company's benchmarks.

Key specifications

GLM-5.2 achieves the following benchmark scores according to Zhipu AI:

  • SWE-bench Pro: 62.1%
  • NL2Repo: 48.9%
  • DeepSWE: 46.2%
  • ProgramBench: 63.7%
  • AIME 2026: 99.2%
  • GPQA-Diamond: 91.2%
  • HMMT November 2025: 94.4%

The model supports 1 million token context length and is available in FP8 quantization format. Parameter count is 753 billion. Pricing has not been disclosed.

Technical architecture

GLM-5.2 introduces IndexShare, which reuses the same indexer across every four sparse attention layers. According to Zhipu AI, this reduces per-token FLOPs by 2.9× at 1M context length compared to the previous architecture.

The model includes an improved MTP (Multi-Token Prediction) layer for speculative decoding, which the company claims increases acceptance length by up to 20%.

GLM-5.2 offers multiple "thinking effort levels" for coding tasks, allowing developers to balance between performance and latency depending on task complexity.

Deployment and availability

The model is released under MIT open-source license with no regional restrictions. It supports deployment through:

  • SGLang (v0.5.13.post1+)
  • vLLM (v0.23.0+)
  • Transformers (v0.5.12+)
  • KTransformers (v0.5.12+)

For Ascend NPU platforms, vLLM-Ascend, xLLM, and SGLang are supported.

API access is available through Z.ai API Platform, though pricing details have not been published.

Benchmark positioning

On coding benchmarks, GLM-5.2 shows improvements over its predecessor GLM-5.1 (58.4% on SWE-bench Pro) and claims to match or exceed Qwen3.7-Max (60.6%) and MiniMax M3 (59%). However, it trails Claude Opus 4.8 (69.2%) and DeepSeek-V4-Pro (55.4%) on several metrics.

On the HLE reasoning benchmark, GLM-5.2 scores 40.5, behind Claude Opus 4.8 (49.8) and Gemini 3.1 Pro (45), but ahead of DeepSeek-V4-Pro (37.7) and MiniMax M3 (37).

What this means

GLM-5.2 represents Zhipu AI's push into the 1M context space dominated by models like Gemini 2.5 Pro and Claude 3.7 Sonnet. The IndexShare architecture addresses a key challenge in long-context models: computational efficiency at extended lengths. The 2.9× reduction in FLOPs at 1M tokens could make the model more practical for deployment compared to architectures that scale linearly with context length. The MIT license removes barriers common in Chinese AI models, potentially increasing adoption in Western markets. However, without disclosed pricing and independent benchmark verification, actual competitiveness against Anthropic and OpenAI's offerings remains to be demonstrated in production environments.

Related Articles

model release

GLM-5.2 Released with 1M Token Context and 753B Parameters Under MIT License

Zhipu AI has released GLM-5.2, a 753 billion parameter model featuring a 1 million token context window and MIT open-source license. The model scores 62.1% on SWE-bench Pro and 91.2% on GPQA-Diamond, with flexible reasoning effort levels for coding tasks.

model release

Z.AI releases GLM-5.2 with 1M token context, outperforms GPT-5.5 on long-horizon coding benchmarks

Z.AI has released GLM-5.2, an open-source model with a 1M-token context window under an MIT license. On FrontierSWE, a long-horizon coding benchmark, GLM-5.2 trails Claude Opus 4.8 by 1% while outperforming GPT-5.5 by 1%, and achieves 81.0 on Terminal-Bench 2.1 compared to Opus 4.8's 85.0.

model release

Z.ai Releases GLM-5.2 with 1M Token Context Window at $1.40/$4.40 per Million

Z.ai has released GLM-5.2, a model designed for long-horizon engineering tasks with a 1 million token context window. The model is priced at $1.40 per million input tokens and $4.40 per million output tokens, and was released on June 16, 2025.

model release

Microsoft Releases FastContext-1.0: 4B-Parameter Repository Explorer Cuts Coding Agent Token Use by 60%

Microsoft released FastContext-1.0, a lightweight repository-exploration subagent for LLM coding agents spanning 4B to 30B parameters. The model reduced main-agent token consumption by up to 60% while improving end-to-end resolution rates by up to 5.5% on SWE-bench Pro when integrated with agents like GPT-5.4 and GLM-5.1.

Comments

Loading...