model release

Poolside releases Laguna XS.2, free fp8-quantized coding agent with 128K context

TL;DR

Poolside has released Laguna XS.2, the second-generation model in its XS size class for agentic coding workflows. The model offers 128K context window, up to 8K output tokens, and is quantized to fp8 for efficiency, available free via OpenRouter.

1 min read
0

Poolside releases Laguna XS.2, free fp8-quantized coding agent with 128K context

Poolside has released Laguna XS.2, the second-generation model in its XS size class designed for agentic coding workflows. The model is available free on OpenRouter as of April 28, 2025.

Technical specifications

Laguna XS.2 offers a 131,072-token context window (128K) with up to 8K output tokens. The model is quantized to fp8 precision, optimizing for speed and cost efficiency in production environments.

According to Poolside, the model combines tool calling and reasoning capabilities within a compact footprint. The company describes it as part of their "efficient coding agent series."

Pricing and availability

The model is available at zero cost through OpenRouter:

  • Input: $0 per million tokens
  • Output: $0 per million tokens

OpenRouter routes requests across multiple providers with automatic fallbacks for uptime optimization.

Reasoning capabilities

Laguna XS.2 supports OpenRouter's reasoning parameter, allowing developers to access step-by-step thinking processes through the reasoning_details array in API responses. The model can preserve reasoning context across conversation turns when the complete reasoning_details are passed back in subsequent requests.

What this means

Poolside is positioning itself in the increasingly competitive coding agent market with a free, quantized model that prioritizes deployment efficiency over raw capability. The fp8 quantization represents a pragmatic trade-off—reduced precision for faster inference—targeting production workflows where cost and latency matter more than maximum accuracy. At 128K context, Laguna XS.2 can handle substantial codebases, though it remains to be seen how the XS size class compares to larger coding models like Claude 3.5 Sonnet or GPT-4 on complex refactoring tasks. The free tier may be a customer acquisition strategy, with Poolside likely planning premium tiers or enterprise offerings.

Related Articles

model release

Nex AGI Releases Nex-N2-Pro: 397B Parameter MoE Model With 262K Context, Available Free

Nex AGI has released Nex-N2-Pro, an agentic mixture-of-experts model with 397B total parameters and 17B active parameters. The model features a 262K token context window and is available free via OpenRouter's API.

model release

MiniMax Releases M3: 428B-Parameter Multimodal Model with 1M Context Window and 15× Decode Speedup

MiniMax has released M3, a multimodal model with approximately 428 billion parameters and 23 billion activated parameters. The model supports a 1 million token context window and uses MiniMax Sparse Attention to achieve 9× prefill and 15× decode speedups compared to its predecessor M2.

model release

Moonshot AI releases Kimi K2.7 Code with 1T parameters, 256K context window, 30% lower thinking token usage

Moonshot AI has released Kimi K2.7 Code, a 1 trillion parameter Mixture-of-Experts model designed for long-horizon coding tasks. The model features a 256K context window and reduces thinking token usage by approximately 30% compared to its predecessor K2.6.

model release

Apple releases AFM 3 lineup: 20B-parameter on-device model and cloud AI running on Google's Nvidia infrastructure

Apple announced five third-generation foundation models at WWDC26, headlined by AFM 3 Core Advanced—a 20-billion-parameter sparse model that runs on-device by activating only 1-4 billion parameters at a time. For the first time, Apple extended Private Cloud Compute to third-party infrastructure, with AFM 3 Cloud Pro running on Nvidia GPUs in Google Cloud.

Comments

Loading...