InclusionAI releases Ling-2.6-flash: 104B parameter model with 7.4B active parameters, free on OpenRouter

TL;DR

InclusionAI has released Ling-2.6-flash, an instruction-tuned model with 104 billion total parameters and 7.4 billion active parameters, available free through OpenRouter. The model features a 262,144-token context window and is designed for agent workflows requiring fast responses and high token efficiency.

InclusionAI has released Ling-2.6-flash, an instruction-tuned model with 104 billion total parameters and 7.4 billion active parameters. The model is available at no cost through OpenRouter as of April 21, 2026.

Model Specifications

Ling-2.6-flash features a 262,144-token context window and is priced at $0 per million tokens for both input and output. The model uses a sparse architecture, activating only 7.4B of its 104B total parameters during inference.

According to InclusionAI, the model is designed for "real-world agents that require fast responses, strong execution, and high token efficiency." The company claims it delivers performance comparable to state-of-the-art models at similar scale while reducing token usage across coding, document processing, and lightweight agent workflows.

Technical Architecture

The model's sparse activation approach, which uses only about 7.1% of its total parameters per inference, enables faster response times than dense models of similar total parameter count. This design follows recent trends in mixture-of-experts and sparse architectures.
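The mechanism behind this kind of sparse activation is typically top-k expert routing: a small gating network scores all experts per token, and only the k highest-scoring experts run. The sketch below illustrates the routing step with made-up numbers; it is not Ling-2.6-flash's actual configuration, which InclusionAI has not published.

```python
# Illustrative top-k expert routing, the mechanism that lets a
# mixture-of-experts model activate only a fraction of its parameters
# per token. Expert counts and logits here are arbitrary examples.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_top_k(gate_logits, k):
    """Select the k highest-scoring experts; only they run for this token."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    # Renormalize the selected gate weights so they sum to 1.
    return {i: probs[i] / total for i in top}

# 2 of 4 experts are activated for this token, so only their parameters
# participate in the forward pass.
weights = route_top_k([0.2, 1.5, -0.3, 0.9], k=2)
print(weights)
```

With 7.4B of 104B parameters active, Ling-2.6-flash's effective k-to-total ratio is roughly 1:14, which is what keeps per-token compute close to that of a much smaller dense model.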

The model is accessible through OpenRouter's unified API, which provides OpenAI-compatible endpoints. OpenRouter routes requests to available providers with automatic fallbacks for uptime optimization.
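Calling the model therefore looks like any OpenAI-style chat completion request pointed at OpenRouter's endpoint. The sketch below builds such a request with the standard library only; the model slug `inclusionai/ling-2.6-flash:free` is an assumption based on OpenRouter's usual `vendor/model:free` naming and should be verified on the model page.

```python
# Sketch: an OpenAI-compatible chat completion request to OpenRouter.
# The model slug below is assumed from OpenRouter naming conventions,
# not confirmed by the announcement.
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for Ling-2.6-flash."""
    payload = {
        "model": "inclusionai/ling-2.6-flash:free",  # assumed slug
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Summarize this diff in one sentence.", "sk-or-...")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` returns the usual OpenAI-shaped JSON, so existing OpenAI SDK clients can also be pointed at the same endpoint via their `base_url` setting.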

Availability

Ling-2.6-flash is currently available exclusively through OpenRouter's platform. The company has not disclosed whether the model will be released through other providers or made available for self-hosting. No benchmark scores have been published at this time.

InclusionAI is not among the previously established AI model providers tracked in industry databases, suggesting this is either a new entrant or an independent research team making its first public model release.

What This Means

The release of a free, high-parameter-count model with sparse activation represents competitive pressure on existing model providers. If performance claims are verified through independent benchmarks, the 262K context window at zero cost could make this attractive for agent applications and document processing tasks. However, without published benchmark scores or information about training data and capabilities, adoption will likely depend on real-world testing by developers. The sparse activation design (7.4B active from 104B total) suggests this is optimized for cost-efficient inference rather than maximum capability.
