model releaseCohere

Cohere releases North Mini Code, a 30B-parameter sparse MoE coding model with 256K context window, free on OpenRouter

TL;DR

Cohere has released North Mini Code, the first model in its North family and its first agentic coding model. The sparse mixture-of-experts architecture features 30B total parameters with 3B active, a 256K-token context window, and up to 64K tokens of output, available free via OpenRouter under Apache 2.0 license.

2 min read
0

Cohere Releases North Mini Code, Free Agentic Coding Model with 256K Context

Cohere has released North Mini Code, a sparse mixture-of-experts (MoE) coding model with 30 billion total parameters and 3 billion active parameters. The model is available free through OpenRouter and marks both Cohere's first agentic coding model and the debut of its North model family.

Technical Specifications

North Mini Code features a 256K-token context window with support for up to 64K tokens of output. The sparse MoE architecture activates only 3B of its 30B total parameters per inference, which according to Cohere enables low-latency inference including on local hardware.

The model is released open-weight under the Apache 2.0 license. Released June 17, 2026, it supports interleaved reasoning and tool use via JSON schema.

Capabilities and Training

Cohere claims the model is optimized for:

  • Code generation
  • Agentic software engineering
  • Terminal tasks

According to the company, North Mini Code is trained to generalize across agent harnesses including OpenCode and SWE-Agent, though specific benchmark scores were not disclosed at release.

Availability

The model is currently hosted exclusively through OpenRouter, where it is offered at no cost. OpenRouter forwards all requests directly to the provider with no routing decisions required.

No pricing has been announced for commercial API access outside of OpenRouter's free tier.

What This Means

North Mini Code represents Cohere's entry into the specialized coding model market, competing with offerings from OpenAI (Codex), Anthropic (Claude with coding focus), and open models like DeepSeek Coder. The sparse MoE architecture's 3B active parameters positions it as a lightweight option for local deployment while maintaining the capacity of a 30B-parameter model.

The 256K context window is notable for a coding model, enabling it to work with larger codebases in a single context. However, without published benchmark scores on standard coding evaluations like HumanEval or MBPP, the model's actual performance relative to competitors remains unclear. The free availability on OpenRouter will likely drive early adoption and testing.

Related Articles

model release

Amazon Bedrock adds Gemma 4 models with 256K context and built-in reasoning mode

Amazon Web Services today announced availability of Google DeepMind's Gemma 4 family on Amazon Bedrock. The open-weight models include three instruction-tuned variants spanning 2.3B to 30.7B parameters, with 256K context windows, multimodal input support, and built-in reasoning mode.

model release

Moonshot AI releases Kimi K2.7 Code with 1T parameters, 256K context window, 30% lower thinking token usage

Moonshot AI has released Kimi K2.7 Code, a 1 trillion parameter Mixture-of-Experts model designed for long-horizon coding tasks. The model features a 256K context window and reduces thinking token usage by approximately 30% compared to its predecessor K2.6.

model release

Google DeepMind releases DiffusionGemma, a 26B parameter model generating 15-20 tokens per forward pass via discrete dif

Google DeepMind released DiffusionGemma, a 26B parameter mixture-of-experts model that generates text using discrete diffusion instead of autoregression. The model processes blocks of 256 tokens in parallel, achieving generation speeds exceeding 1100 tokens per second on H100 GPUs in low-batch settings.

model release

Cohere Releases North Mini Code 1.0: 30B-Parameter MoE Model With 256K Context for Agentic Coding

Cohere Labs has released North Mini Code 1.0, a 30B-parameter sparse Mixture-of-Experts model with 3B active parameters and a 256K context window. The Apache 2.0-licensed model is optimized for agentic software engineering, featuring 128 experts with 8 activated per token, and trained specifically for tool use in coding tasks.

Comments

Loading...