model releasePoolside

Poolside releases Laguna M.1: 225B parameter MoE model scores 74.6% on SWE-bench Verified

TL;DR

Poolside has released Laguna M.1, a 225B total parameter Mixture-of-Experts model with 23B activated parameters per token, designed for agentic coding tasks. The model scores 74.6% on SWE-bench Verified and 63.1% on SWE-bench Multilingual, released under Apache 2.0 license.

2 min read
0

Poolside Releases Laguna M.1: 225B Parameter MoE for Agentic Coding

Poolside has released Laguna M.1, a 225B total parameter Mixture-of-Experts (MoE) model with 23B activated parameters per token, designed specifically for agentic coding and long-horizon development work.

Architecture and Specifications

Laguna M.1 uses a 70-layer MoE transformer with 256 experts and top-k=16 routing. The first 3 layers are dense SwiGLU, while the remaining 67 layers use sparse MoE. The model employs global attention across all layers with 64 Q-heads, 8 KV-heads, and head dimension 128. Context window extends to 262,144 tokens.

The model uses RoPE with YaRN for positional encoding and includes native reasoning support through interleaved thinking between tool calls. Training involved pre-training, post-training, and reinforcement learning stages using the Muon optimizer.

Benchmark Performance

Poolside reports the following scores on agentic coding benchmarks (averaged over 4 runs at temperature=1.0, top_k=20):

  • SWE-bench Verified: 74.6%
  • SWE-bench Multilingual: 63.1%
  • SWE-bench Pro: 49.2%
  • Terminal-Bench 2.0: 45.8%

For comparison, DeepSeek-V4 Flash (284B total, 13B active) scores 79.0% on SWE-bench Verified, while Qwen3.5 (397B total, 17B active) achieves 76.2%. Claude Sonnet 4.6 scored 79.6% on SWE-bench Verified and 59.1% on Terminal-Bench 2.0, according to Poolside's comparison table.

Deployment and Availability

The model is released under Apache 2.0 license and supports deployment via vLLM (version 0.21.0+), SGLang, Transformers (v5.7.0+), and TensorRT-LLM. Quantized checkpoints are available in FP8 and NVFP4 formats.

Poolside provides a terminal-based coding agent called "pool" that integrates with the model and supports the Agent Client Protocol. The tool auto-configures with Zed and JetBrains editors.

All benchmarking used Poolside's pool agent harness with maximum 500 steps and sandboxed execution. Tasks ran in 8GB RAM/2 CPU sandboxes, except Terminal-Bench 2.0 which used 48GB RAM/32 CPUs. Poolside states they ran a reward-hack judge post-hoc and "did not find significant reward hacking."

What This Means

Laguna M.1 represents a competitive open-weight option for agentic coding tasks, though it trails frontier models like DeepSeek-V4 Flash on SWE-bench benchmarks by 4-11 percentage points depending on the task. The Apache 2.0 license and support for standard serving frameworks lower deployment barriers compared to proprietary alternatives. The 225B total parameter count with 23B active parameters positions it between smaller dense models and larger MoE architectures in terms of inference cost, though actual performance-per-dollar will depend on hardware and optimization.

Related Articles

model release

Mistral Releases Mistral 3 Family: 675B-Parameter Large 3 MoE and Three Edge Models Under Apache 2.0

Mistral has released Mistral 3, including Mistral Large 3—a sparse mixture-of-experts model with 41B active and 675B total parameters—and three Ministral 3 edge models (3B, 8B, 14B). All models are released under Apache 2.0 license with multimodal capabilities and are available today on multiple platforms.

model release

Cohere releases North Mini Code, a 30B-parameter sparse MoE coding model with 256K context window, free on OpenRouter

Cohere has released North Mini Code, the first model in its North family and its first agentic coding model. The sparse mixture-of-experts architecture features 30B total parameters with 3B active, a 256K-token context window, and up to 64K tokens of output, available free via OpenRouter under Apache 2.0 license.

model release

Zhipu AI releases GLM-5.2 with 1M token context and 62.1% SWE-bench Pro score

Zhipu AI released GLM-5.2, a 753 billion parameter model with a 1 million token context window. The model scores 62.1% on SWE-bench Pro and introduces IndexShare architecture that reduces per-token FLOPs by 2.9× at 1M context length. Released under MIT license with no regional restrictions.

model release

Microsoft Releases FastContext-1.0: 4B-Parameter Repository Explorer Cuts Coding Agent Token Use by 60%

Microsoft released FastContext-1.0, a lightweight repository-exploration subagent for LLM coding agents spanning 4B to 30B parameters. The model reduced main-agent token consumption by up to 60% while improving end-to-end resolution rates by up to 5.5% on SWE-bench Pro when integrated with agents like GPT-5.4 and GLM-5.1.

Comments

Loading...