Poolside releases Laguna XS.2: 33B parameter MoE coding model with 131K context window
Poolside has released Laguna XS.2, a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token, designed for agentic coding. The model features a 131,072-token context window, scores 68.2% on SWE-bench Verified, and is available under Apache 2.0 license with free API access.
Poolside has released Laguna XS.2, a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token, designed for agentic coding and long-horizon work on local machines.
Model specifications
Laguna XS.2 uses Sliding Window Attention with per-head gating in 30 of its 40 layers. The architecture includes:
- Total parameters: 33B with 3B activated per token
- Context window: 131,072 tokens
- Experts: 256 experts with 1 shared expert
- Architecture: 40 layers (10 global attention, 30 sliding window attention)
- Sliding window: 512 tokens
- Training: Muon optimizer, with pre-training, post-training, and reinforcement learning stages
- License: Apache 2.0
The model uses FP8-quantized KV cache to reduce memory requirements and supports native reasoning with interleaved thinking between tool calls.
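For a sense of why the hybrid attention layout and FP8 cache matter for memory, here is a rough sizing sketch. Only the layer split (10 global / 30 sliding window), the 512-token window, the 131,072-token context, and 1-byte FP8 storage come from the specs above; the KV head count and head dimension are placeholder assumptions, since they are not stated in the announcement.

```python
# Back-of-the-envelope KV-cache sizing for a hybrid attention stack.
# NUM_KV_HEADS and HEAD_DIM are ASSUMED placeholders, not published specs.

GLOBAL_LAYERS = 10
SWA_LAYERS = 30
WINDOW = 512              # sliding-window span in tokens
CONTEXT = 131_072         # maximum context length in tokens

NUM_KV_HEADS = 8          # assumption for illustration only
HEAD_DIM = 128            # assumption for illustration only
BYTES_PER_ELEM = 1        # FP8-quantized KV cache

def kv_bytes(tokens_cached: int, layers: int) -> int:
    """Bytes needed to store K and V across `layers` for a given cached-token count."""
    per_token_per_layer = 2 * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_ELEM
    return tokens_cached * layers * per_token_per_layer

full = kv_bytes(CONTEXT, GLOBAL_LAYERS)    # global layers cache every token
windowed = kv_bytes(WINDOW, SWA_LAYERS)    # SWA layers cache only the 512-token window
print(f"global-attention KV cache: {full / 2**30:.2f} GiB")
print(f"sliding-window KV cache:   {windowed / 2**30:.3f} GiB")
print(f"total at full context:     {(full + windowed) / 2**30:.2f} GiB")
```

Under these assumptions the windowed layers contribute only a few tens of megabytes even at full context, which is why capping 30 of the 40 layers at a 512-token window keeps the cache small enough for consumer hardware.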
Benchmark performance
According to Poolside, Laguna XS.2 achieves the following scores:
- SWE-bench Verified: 68.2% (mean pass@1 over 4 runs)
- SWE-bench Multilingual: 62.4% (mean pass@1 over 7 runs)
- SWE-bench Pro: 44.5% (mean pass@1 over 3 runs)
- Terminal-Bench 2.0: 30.1% (mean pass@1 over 5 runs)
Poolside claims these scores make the model competitive with Devstral Small 2 (24B dense, 68.0% on SWE-bench Verified) while activating fewer parameters per token. Qwen3.6-35B-A3B leads the comparison group at 73.4% on SWE-bench Verified.
All benchmarking was completed using the Laude Institute's Harbor Framework with temperature=0.7 and top_k=20. The company notes that some task images and verifiers were patched to fix infrastructure reliability issues.
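The reported metric, "mean pass@1 over N runs", is simply the single-attempt resolution rate averaged across independent runs. A minimal sketch is below; the run outcomes are invented placeholders, not Poolside's data.

```python
# Mean pass@1 over multiple runs: average the fraction of tasks resolved
# on the first attempt across independent evaluation runs.

def pass_at_1(resolved: list[bool]) -> float:
    """Fraction of tasks resolved in a single attempt for one run."""
    return sum(resolved) / len(resolved)

# e.g. four independent runs over the same task set (hypothetical outcomes)
runs = [
    [True, False, True, True],
    [True, True, False, True],
    [True, False, True, True],
    [False, True, True, True],
]
mean_pass_at_1 = sum(pass_at_1(r) for r in runs) / len(runs)
print(f"mean pass@1 over {len(runs)} runs: {mean_pass_at_1:.1%}")
```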
Availability and deployment
Poolside is offering free API access to Laguna XS.2 and its larger 225B model, Laguna M.1, for a limited time. Pricing after the free period has not been disclosed.
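If the free API follows the common OpenAI-compatible chat-completions convention (an assumption; the announcement does not describe the endpoint), a request would look roughly like the sketch below. The base URL and model slug are placeholders, not confirmed values.

```python
# Hedged sketch of calling the free API, ASSUMING an OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-poolside-endpoint.com/v1",  # placeholder URL
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="laguna-xs-2",  # hypothetical model slug
    messages=[
        {"role": "user", "content": "Fix the off-by-one bug in this loop: ..."}
    ],
    temperature=0.7,  # matches the sampling temperature used in Poolside's benchmarks
)
print(response.choices[0].message.content)
```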
The model has launch-day support in:
- vLLM (pending PR merge)
- Transformers (support merged, shipping in release after v5.6.2)
- TRT-LLM (pending upstream PR)
- Ollama with MLX support
The company has also released pool, a lightweight terminal-based coding agent that works with the model. According to Poolside, Laguna XS.2 can run on a Mac with 36 GB of RAM.
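For local experimentation once the merged Transformers support ships, loading the model would look roughly like the sketch below. The repository ID is a placeholder guess, not a confirmed name; check Poolside's model card for the actual identifier.

```python
# Hedged sketch of loading the model with Hugging Face Transformers.
# "poolside/laguna-xs-2" is a HYPOTHETICAL repo ID used for illustration.
# Once the pending vLLM PR lands, serving should also work via:
#   vllm serve poolside/laguna-xs-2
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "poolside/laguna-xs-2"  # placeholder repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # requires the accelerate package; spreads layers across devices
)

prompt = "Write a Python function that reverses a linked list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```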
What this means
Laguna XS.2 represents a push toward locally runnable coding models with competitive benchmark performance. The 33B total parameter count with only 3B activated per token makes it accessible to developers with high-end consumer hardware, while the Apache 2.0 license removes commercial restrictions. The 131K context window matches or exceeds that of many commercial models, potentially enabling longer coding sessions without context truncation. However, its 68.2% SWE-bench Verified score trails the latest Qwen and Claude models, suggesting it is best suited to local development workflows where privacy and control outweigh absolute performance.