IBM Releases Granite 4.1 8B with 131K Context Window at $0.05/M Input Tokens
IBM has released Granite 4.1 8B, an 8-billion-parameter decoder-only language model with a 131,072-token context window. The model supports 12 languages and costs $0.05 per million input tokens and $0.10 per million output tokens, available under the Apache 2.0 license.
Model Specifications
Granite 4.1 8B is a dense transformer model with 8 billion parameters, released on April 30, 2026. Its context window of 131,072 tokens (128K) places it in the long-context category otherwise occupied by much larger models such as GPT-4 and Claude.
The model is distributed under the Apache 2.0 license, making it available for both commercial and research use under permissive terms.
Capabilities
Granite 4.1 8B targets enterprise use cases with several specific features:
- Tool calling: Implements OpenAI-compatible function calling for integration with external systems (a request sketch appears in the Deployment section below)
- Code generation: Includes fill-in-the-middle support for code completion tasks (see the first sketch after this section)
- RAG support: Designed for retrieval-augmented generation workflows (see the second sketch after this section)
- Text processing: Handles summarization, classification, and extraction tasks
The model supports 12 languages: English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.
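IBM has not published the fill-in-the-middle prompt format for this release, but its earlier Granite code models used StarCoder-style sentinel tokens. A minimal sketch under that assumption; the sentinels, base URL, and model id are unverified and should be checked against the tokenizer config and the provider's model list:

```python
from openai import OpenAI

# Any OpenAI-compatible /completions endpoint can serve a FIM prompt;
# the base URL and model id below are illustrative, not confirmed.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="...")

prefix = "def average(values):\n    "
suffix = "\n    return total / len(values)"

# StarCoder-style sentinels, as used by earlier Granite code models.
# Assumption: verify against this model's tokenizer config before use.
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

resp = client.completions.create(
    model="ibm/granite-4.1-8b",
    prompt=fim_prompt,
    max_tokens=64,
)
print(resp.choices[0].text)  # expected: the middle, e.g. code computing `total`
```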
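For RAG, the long context window means retrieved passages can be placed directly into the prompt. A minimal sketch, assuming an OpenAI-compatible chat endpoint and leaving retrieval itself abstract; the model id is illustrative:

```python
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="...")

def answer_with_rag(question: str, passages: list[str]) -> str:
    """Answer a question grounded in pre-retrieved passages."""
    # Concatenate retrieved passages into the prompt; a 131K window
    # leaves room for hundreds of pages of context.
    context = "\n\n".join(passages)
    resp = client.chat.completions.create(
        model="ibm/granite-4.1-8b",  # illustrative model id
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content
```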
Deployment
IBM is distributing Granite 4.1 8B through OpenRouter, which provides routing to multiple infrastructure providers. Model weights are publicly available for self-hosting.
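Because OpenRouter exposes an OpenAI-compatible API, the tool-calling support noted under Capabilities can be exercised with a standard chat-completions request. A minimal sketch; the model slug and the example tool are assumptions, not confirmed identifiers:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

# One tool definition in the standard OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",  # hypothetical enterprise tool
        "description": "Look up the status of an order by its id.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

resp = client.chat.completions.create(
    model="ibm/granite-4.1-8b",  # illustrative slug, verify on OpenRouter
    messages=[{"role": "user", "content": "Where is order 8842?"}],
    tools=tools,
)

# If the model chooses to call the tool (tool_calls may be None if it
# answers directly), the arguments arrive as a JSON string.
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)
```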
The pricing structure of $0.05 per million input tokens and $0.10 per million output tokens places it in the mid-range pricing tier for models of this size, comparable to other 8B-parameter models from companies like Meta and Mistral.
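At these rates cost scales linearly with token counts, so a back-of-envelope estimate is one line of arithmetic; filling the entire 131,072-token window and generating a 1,000-token reply costs well under a cent:

```python
def request_cost(input_tokens: int, output_tokens: int) -> float:
    # $0.05 per million input tokens, $0.10 per million output tokens.
    return input_tokens * 0.05e-6 + output_tokens * 0.10e-6

# Full context plus a 1,000-token reply: ~$0.00665.
print(f"${request_cost(131_072, 1_000):.5f}")
```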
What This Means
Granite 4.1 8B represents IBM's continued investment in open-source enterprise AI, offering a permissively licensed alternative to proprietary models. The 131K context window is notably large for an 8B-parameter model, though actual performance at that context length remains to be independently verified. The Apache 2.0 license and multilingual support make it a viable option for enterprises that require on-premises deployment or specific regulatory compliance, particularly in markets covered by the 12 supported languages.
Related Articles
IBM releases Granite 4.1-8B with 131K context window and enhanced tool-calling capabilities
IBM has released Granite 4.1-8B, an 8-billion-parameter long-context model with a 131,072-token context window. The model achieves 85.37% on HumanEval and 73.84% on MMLU 5-shot, with enhanced tool-calling capabilities reaching 68.27% on BFCL v3. Released under the Apache 2.0 license, it supports 12 languages.
IBM's Granite 4.1: 8B Dense Model Matches 32B MoE Performance on 15T Tokens
IBM released Granite 4.1, a family of dense decoder-only LLMs (3B, 8B, 30B parameters) trained on approximately 15 trillion tokens using a five-phase pre-training pipeline. The 8B instruct model matches or surpasses the previous Granite 4.0-H-Small (32B-A9B MoE) despite using fewer parameters and a simpler dense architecture. All models support up to 512K context windows and are released under Apache 2.0 license.
IBM releases Bob AI coding assistant after testing on 80,000 employees, claims 45% productivity gains
IBM has launched Bob, its AI coding assistant, following internal testing with 80,000 employees. The company claims teams saw average productivity gains of 45% across complex workflows. Pricing ranges from $20 to $200 per month using a "Bobcoin" credit system.
Xiaomi releases MiMo-V2.5: 310B parameter omnimodal model with 1M token context window
Xiaomi released MiMo-V2.5, a 310B total parameter sparse mixture-of-experts model that activates 15B parameters per token. The omnimodal model supports text, image, video, and audio understanding with a 1M token context window and was trained on 48T tokens using FP8 mixed precision.