model release

OpenRouter Releases Elephant Alpha: 100B-Parameter Model with 256K Context Window and Free Pricing

TL;DR

OpenRouter has released Elephant Alpha, a 100B-parameter text model with a 256K context window and 32K output token limit. The model is available at no cost through OpenRouter's platform, supporting function calling, structured output, and prompt caching.

April 13, 2026 · 3:36 PM2 min read

Elephant Alpha — Quick Specs

Context window262K tokens

Compare Elephant Alpha with other models →

OpenRouter Releases Elephant Alpha: 100B-Parameter Model with 256K Context Window and Free Pricing

OpenRouter has released Elephant Alpha, a 100B-parameter text model designed for "intelligence efficiency" with a 256K context window and support for up to 32K output tokens. The model is available at $0 per million tokens for both input and output through OpenRouter's routing platform.

Technical Specifications

Elephant Alpha features:

100 billion parameters
262,144 (256K) token context window
32,768 (32K) maximum output tokens
Function calling support
Structured output capabilities
Prompt caching
Released April 13, 2025

According to OpenRouter, the model focuses on "delivering strong reasoning performance while minimizing token usage," though specific benchmark scores have not been disclosed.

Target Use Cases

OpenRouter positions Elephant Alpha for three primary applications:

Code completion and debugging
Rapid document processing
Lightweight agent interactions

The model is available through OpenRouter's unified API, which routes requests across multiple providers with automatic fallbacks. OpenRouter notes that prompts and completions may be logged by the provider and used for model improvement.

Pricing and Access

The model is currently available at zero cost through OpenRouter's platform, with no charges for input or output tokens. This pricing is managed through OpenRouter's routing system, which normalizes requests and responses across providers.

The model supports OpenAI-compatible API calls and can be accessed through the OpenAI SDK as well as various third-party SDKs and frameworks.

What This Means

Elephant Alpha enters a crowded field of large language models with a distinctive positioning around "intelligence efficiency" and a notably large context window at 256K tokens. The free pricing through OpenRouter makes it accessible for experimentation, though the lack of published benchmarks makes it difficult to assess performance claims against established models. The 32K output token limit is substantially higher than many competing models, which could be useful for document generation tasks. However, the data logging policy and absence of performance metrics warrant careful evaluation for production deployments.

Source: openrouter.ai ↗

OpenRouter Elephant Alpha LLM 100B parameters 256K context free pricing function calling

model releaseMay 29, 2026

StepFun launches Step 3.7 Flash: 196B MoE model with 256K context and adjustable reasoning levels at $0.20/$1.15 per 1M

StepFun has released Step 3.7 Flash, a 196B-parameter Mixture-of-Experts model that activates approximately 11B parameters per token. The multimodal model supports a 256K context window and introduces selectable reasoning levels (high/medium/low), priced at $0.20 per 1M input tokens and $1.15 per 1M output tokens.

model releaseMay 29, 2026

StepFun releases Step-3.7-Flash: 198B-parameter MoE model with 256K context at $0.20/M input tokens

StepFun has released Step-3.7-Flash, a 198B-parameter sparse Mixture-of-Experts vision-language model that activates 11B parameters per token and delivers up to 400 tokens per second. The model supports a 256K context window, three selectable reasoning levels, and is priced at $0.20 per million input tokens (cache miss) and $1.15 per million output tokens.

model releaseMay 29, 2026

Liquid AI Releases LFM2.5-8B: 8-Billion Parameter Hybrid Model Optimized for Edge Deployment

Liquid AI has released LFM2.5-8B-A1B, an 8-billion parameter hybrid model designed specifically for edge AI and on-device deployment. The model is available in multiple GGUF quantized formats ranging from 4-bit (4.84 GB) to 16-bit (16.9 GB), optimized for memory efficiency.

model releaseMay 28, 2026

Anthropic's Opus 4.8 matches Claude Mythos Preview in alignment, cuts thinking mode costs by 67%

Anthropic released Claude Opus 4.8 on May 28, 2026, replacing Opus 4.7 at unchanged pricing. The company claims the model's misalignment rates match those of Claude Mythos Preview, the experimental model deemed too dangerous for public release in April 2026. Opus 4.8 delivers faster thinking modes at one-third the cost of version 4.7.

OpenRouter Releases Elephant Alpha: 100B-Parameter Model with 256K Context Window and Free Pricing

Elephant Alpha — Quick Specs

OpenRouter Releases Elephant Alpha: 100B-Parameter Model with 256K Context Window and Free Pricing

Technical Specifications

Target Use Cases

Pricing and Access

What This Means

Related Articles

StepFun launches Step 3.7 Flash: 196B MoE model with 256K context and adjustable reasoning levels at $0.20/$1.15 per 1M

StepFun releases Step-3.7-Flash: 198B-parameter MoE model with 256K context at $0.20/M input tokens

Liquid AI Releases LFM2.5-8B: 8-Billion Parameter Hybrid Model Optimized for Edge Deployment

Anthropic's Opus 4.8 matches Claude Mythos Preview in alignment, cuts thinking mode costs by 67%

Comments