Moonshot AI Releases Kimi K2.6: 1T-Parameter MoE Model with 256K Context and Agent Swarm Capabilities
Moonshot AI has released Kimi K2.6, an open-source multimodal model with 1 trillion total parameters (32B activated) and 256K context window. The model achieves 80.2% on SWE-Bench Verified, 58.6% on SWE-Bench Pro, and supports horizontal scaling to 300 sub-agents executing 4,000 coordinated steps.
Moonshot AI has released Kimi K2.6, an open-source multimodal model with 1 trillion total parameters and 32 billion activated parameters per forward pass. The model supports a 256K token context window and is designed for long-horizon coding, autonomous agent orchestration, and coding-driven design tasks.
Architecture and Specifications
Kimi K2.6 uses a Mixture-of-Experts (MoE) architecture with 384 total experts, selecting 8 experts per token plus 1 shared expert. The model features:
- 61 layers in total (including 1 dense layer)
- Attention hidden dimension of 7,168
- MoE hidden dimension of 2,048 per expert
- 64 attention heads
- 160K-token vocabulary
- Multi-head Latent Attention (MLA)
- SwiGLU activation function
- MoonViT vision encoder with 400M parameters
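The routing scheme above (8 of 384 routed experts selected per token, with gating weights normalized over the selection) can be sketched with a toy softmax-gated top-k router. This is a generic MoE convention for illustration, not Moonshot AI's actual implementation; the router weights here are random.

```python
import numpy as np

# Illustrative top-k expert routing with the dimensions reported for Kimi K2.6:
# 384 routed experts, 8 selected per token (the 1 shared expert is always active
# and bypasses routing, so it is omitted here).
NUM_EXPERTS = 384
TOP_K = 8
HIDDEN = 7168  # attention hidden dimension

rng = np.random.default_rng(0)

def route(token_hidden: np.ndarray, router_w: np.ndarray):
    """Return indices and normalized gate weights of the top-k experts."""
    logits = token_hidden @ router_w              # (NUM_EXPERTS,)
    top = np.argsort(logits)[-TOP_K:]             # indices of the 8 best experts
    gate = np.exp(logits[top] - logits[top].max())
    gate /= gate.sum()                            # softmax over selected experts
    return top, gate

token = rng.standard_normal(HIDDEN)
router = rng.standard_normal((HIDDEN, NUM_EXPERTS))
experts, weights = route(token, router)
```

Only the 8 selected experts (plus the shared expert) run for each token, which is how a 1T-parameter model activates just 32B parameters per forward pass.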
The model is available with native INT4 quantization and can be deployed on vLLM, SGLang, and KTransformers inference engines.
Benchmark Performance
On coding benchmarks, Kimi K2.6 achieves 80.2% on SWE-Bench Verified (averaged over 10 runs), 58.6% on SWE-Bench Pro, and 76.7% on SWE-Bench Multilingual. The model scores 66.7% on Terminal-Bench 2.0 and 89.6% on LiveCodeBench v6.
For agentic tasks with tool use, the model reaches 54.0% on HLE-Full (compared to 52.1% for GPT-5.4 and 53.0% for Claude Opus 4.6). On BrowseComp, it scores 83.2% in single-agent mode and 86.3% using agent swarm capabilities. For deep research tasks, Kimi K2.6 achieves 92.5% F1-score and 83.0% accuracy on DeepSearchQA.
On reasoning benchmarks, the model scores 96.4% on AIME 2026, 92.7% on HMMT 2026, and 90.5% on GPQA-Diamond. Vision-language performance includes 79.4% on MMMU-Pro (80.1% with Python tool use) and 87.4% on MathVision (93.2% with Python).
Agent Swarm Architecture
According to Moonshot AI, Kimi K2.6 can scale horizontally to 300 sub-agents executing 4,000 coordinated steps. The system dynamically decomposes tasks into parallel, domain-specialized subtasks and can generate end-to-end outputs including documents, websites, and spreadsheets in autonomous runs. The company claims the model supports persistent, 24/7 background agents for proactive task management.
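The coordinator-plus-sub-agents pattern Moonshot describes can be sketched as a simple parallel fan-out. The `decompose` and `run_subagent` functions below are hypothetical placeholders; the real system's task decomposition and 4,000-step execution traces are not publicly documented.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sketch of horizontal agent fan-out: a coordinator splits a task into
# independent subtasks, runs them in parallel, and collects the results.
def decompose(task: str, n: int) -> list[str]:
    # Placeholder: a real coordinator would produce domain-specialized subtasks.
    return [f"{task}::part-{i}" for i in range(n)]

def run_subagent(subtask: str) -> str:
    # Placeholder: a real sub-agent would call the model with its own
    # context window and tool access.
    return f"done({subtask})"

def run_swarm(task: str, n_agents: int = 8) -> list[str]:
    subtasks = decompose(task, n_agents)
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        return list(pool.map(run_subagent, subtasks))

results = run_swarm("build-report", n_agents=8)
```

Scaling this pattern to 300 sub-agents is an orchestration problem (scheduling, result merging, failure handling) rather than a model-architecture change, which is consistent with the claimed gains appearing on multi-step browsing and research benchmarks.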
Availability and API
Kimi K2.6 is available through Moonshot AI's API platform at platform.moonshot.ai with OpenAI and Anthropic-compatible APIs. Pricing has not been disclosed. The model supports two modes: Thinking mode (recommended temperature 1.0) and Instant mode (recommended temperature 0.6), both with top_p of 0.95.
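Since the two modes differ only in recommended sampling settings, a small helper can encode them for use with an OpenAI-compatible client. The settings match the article; the model identifier and base URL in the comment are assumptions, as the article does not give exact API values.

```python
# Recommended sampling parameters per documented mode.
def sampling_params(mode: str) -> dict:
    if mode == "thinking":
        return {"temperature": 1.0, "top_p": 0.95}
    if mode == "instant":
        return {"temperature": 0.6, "top_p": 0.95}
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical usage with the OpenAI Python client (model id and base URL
# are illustrative guesses, not confirmed by the article):
# client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key="...")
# client.chat.completions.create(model="kimi-k2.6",
#                                messages=[...],
#                                **sampling_params("thinking"))

params = sampling_params("instant")
```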
Deployment requires a transformers version of at least 4.57.1 and below 5.0.0. Chat over video content is currently an experimental feature available only through the official API.
What This Means
Kimi K2.6 represents a notable approach to scaling agent capabilities through horizontal swarm orchestration rather than only vertical reasoning depth. The 80.2% SWE-Bench Verified score places it competitively with frontier models, though its real differentiation appears in multi-agent coordination benchmarks: swarm mode adds roughly 3 percentage points over single-agent operation on BrowseComp (86.3% versus 83.2%). The 256K context window and support for 4,000-step execution traces suggest the model is optimized for complex, long-running autonomous workflows rather than single-shot inference tasks.
Related Articles
Alibaba Releases Qwen3.6-35B-A3B: 35B Parameter MoE Model with 262K Context Window
Alibaba has released Qwen3.6-35B-A3B, the first open-weight model in the Qwen3.6 series. The model features 35B total parameters with 3B activated, a native 262K context window extensible to 1.01M tokens, and achieves 73.4% on SWE-bench Verified using 256 experts with 8 activated per token.
Anthropic ships Claude Opus 4.7 with improved coding reliability and multimodal capabilities
Anthropic has released Claude Opus 4.7, its latest generally available AI model focused on advanced software engineering. The model shows improvements in handling complex coding tasks with less supervision, enhanced vision capabilities, and better instruction following, while introducing a new tokenizer that increases token usage by 1.0-1.35× depending on content type.
Tencent Releases HY-World 2.0: Open-Source Multi-Modal Model Generates 3D Worlds from Text and Images
Tencent has released HY-World 2.0, an open-source multi-modal world model that generates navigable 3D environments from text prompts, single images, multi-view images, or video. The model produces editable 3D assets including meshes and 3D Gaussian Splattings that can be directly imported into game engines like Unity and Unreal Engine.