model releaseInclusionai

InclusionAI Releases Ring-2.6-1T: 1 Trillion Parameter Thinking Model with 63B Active Parameters

TL;DR

InclusionAI has released Ring-2.6-1T, a 1 trillion parameter-scale model with 63 billion active parameters and a 262,144-token context window. The model features adaptive reasoning modes and is designed for coding agents, tool use, and long-horizon task execution.

2 min read
1

InclusionAI Releases Ring-2.6-1T: 1 Trillion Parameter Thinking Model with 63B Active Parameters

InclusionAI has released Ring-2.6-1T, a thinking model with 1 trillion parameters at scale but only 63 billion active parameters during inference. The model is now available on OpenRouter with a 262,144-token context window.

Architecture and Performance

Ring-2.6-1T uses a sparse architecture that activates 63B of its 1T total parameters, designed to balance capability with operational efficiency. According to InclusionAI, the model delivers leading results on PinchBench, ClawEval, TAU2-Bench, and GAIA2-search benchmarks, though specific scores were not disclosed.

The model features adaptive reasoning with "high" and "xhigh" modes that dynamically allocate reasoning budget based on task complexity. This approach aims to reduce token overhead in multi-turn agent workflows compared to fixed reasoning strategies.

Target Use Cases

InclusionAI positions Ring-2.6-1T for three primary applications:

  • Coding agents: Advanced code generation and debugging workflows
  • Tool use: Multi-step operations requiring external API calls and function execution
  • Long-horizon tasks: Complex autonomous systems that require planning across extended interactions

The model's 262K context window enables handling of large codebases and extended conversation histories without truncation.

Availability and Pricing

Ring-2.6-1T is available through OpenRouter's platform with a "free" tier, though specific pricing details for paid usage tiers were not provided. The model was released on May 8, 2026, making it one of the first major releases of that year.

No information about alternative API access, self-hosting options, or licensing terms has been disclosed.

What This Means

The sparse activation approach—using only 63B of 1T parameters—represents a continued industry trend toward mixture-of-experts and conditional compute architectures that reduce inference costs while maintaining model capacity. The 262K context window places Ring-2.6-1T among longer-context models, though it remains below the 1M+ token windows recently announced by some competitors. The focus on agent workflows and tool use suggests InclusionAI is targeting the growing market for autonomous AI systems rather than pure chat applications. However, without disclosed benchmark scores or third-party validation, actual performance relative to established models like GPT-4, Claude, or DeepSeek remains uncertain.

Related Articles

model release

Mistral releases Leanstral, open-source 6B-parameter proof assistant for Lean 4 under Apache 2.0

Mistral AI has released Leanstral, a sparse 120B model with 6B active parameters designed specifically for the Lean 4 proof assistant. The model is available under Apache 2.0 license with free API access and achieves a 26.3 FLTEval score at pass@2, outperforming Claude Sonnet 4.6 while costing $36 versus $549.

model release

Zhipu AI releases GLM-5.2 with 1M token context and 62.1% SWE-bench Pro score

Zhipu AI released GLM-5.2, a 753 billion parameter model with a 1 million token context window. The model scores 62.1% on SWE-bench Pro and introduces IndexShare architecture that reduces per-token FLOPs by 2.9× at 1M context length. Released under MIT license with no regional restrictions.

model release

NVIDIA Releases Quantized DiffusionGemma 26B: 1,100+ Tokens/Second with 256K Context Window

NVIDIA released a quantized version of Google DeepMind's DiffusionGemma 26B A4B IT, a multimodal model with 25.2B total parameters (3.8B active) that processes text, image, and video inputs. The NVFP4-quantized model achieves generation speeds exceeding 1,100 tokens per second on NVIDIA H100 GPUs while supporting a 256K token context window.

model release

Z.AI releases GLM-5.2 with 1M token context, outperforms GPT-5.5 on long-horizon coding benchmarks

Z.AI has released GLM-5.2, an open-source model with a 1M-token context window under an MIT license. On FrontierSWE, a long-horizon coding benchmark, GLM-5.2 trails Claude Opus 4.8 by 1% while outperforming GPT-5.5 by 1%, and achieves 81.0 on Terminal-Bench 2.1 compared to Opus 4.8's 85.0.

Comments

Loading...