IBM Releases Granite 4.1 8B with 131K Context Window at $0.05/M Input Tokens
IBM has released Granite 4.1 8B, an 8-billion-parameter decoder-only language model with a 131,072-token context window. The model supports 12 languages and costs $0.05 per million input tokens and $0.10 per million output tokens, available under the Apache 2.0 license.
Granite 4.1 8B — Quick Specs
IBM Releases Granite 4.1 8B with 131K Context Window at $0.05/M Input Tokens
IBM has released Granite 4.1 8B, an 8-billion-parameter decoder-only language model with a 131,072-token context window, priced at $0.05 per million input tokens and $0.10 per million output tokens.
Model Specifications
Granite 4.1 8B is a dense transformer model with 8 billion parameters, released on April 30, 2026. The model supports a context window of 131,072 tokens, positioning it in the long-context model category alongside competitors like Claude and GPT-4.
The model is distributed under the Apache 2.0 license, making it available for both commercial and research use without licensing restrictions.
Capabilities
Granite 4.1 8B targets enterprise use cases with several specific features:
- Tool calling: Implements OpenAI-compatible function calling for integration with external systems
- Code generation: Includes fill-in-the-middle support for code completion tasks
- RAG support: Designed for retrieval-augmented generation workflows
- Text processing: Handles summarization, classification, and extraction tasks
The model supports 12 languages: English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.
Deployment
IBM is distributing Granite 4.1 8B through OpenRouter, which provides routing to multiple infrastructure providers. Model weights are publicly available for self-hosting.
The pricing structure of $0.05 per million input tokens and $0.10 per million output tokens places it in the mid-range pricing tier for models of this size, comparable to other 8B-parameter models from companies like Meta and Mistral.
What This Means
Granite 4.1 8B represents IBM's continued investment in open-source enterprise AI, offering a permissively licensed alternative to proprietary models. The 131K context window is notably large for an 8B-parameter model, though actual performance at that context length remains to be independently verified. The Apache 2.0 license and multi-language support make it a viable option for enterprises requiring on-premises deployment or specific regulatory compliance, particularly in the 12 supported language markets.
Related Articles
MiniMax Releases M3: 428B-Parameter Multimodal Model with 1M Context Window and 15× Decode Speedup
MiniMax has released M3, a multimodal model with approximately 428 billion parameters and 23 billion activated parameters. The model supports a 1 million token context window and uses MiniMax Sparse Attention to achieve 9× prefill and 15× decode speedups compared to its predecessor M2.
Google DeepMind releases Gemma 4 12B: encoder-free multimodal model runs on 16GB RAM
Google DeepMind has released Gemma 4 12B, a 12-billion parameter multimodal model that runs locally on laptops with 16GB of RAM. The model eliminates separate vision and audio encoders, processing raw inputs directly through its language model backbone under an Apache 2.0 license.
White House forces Anthropic to pull Fable 5 AI model after Amazon security report
Anthropic's Fable 5 AI model was pulled from public access Friday night after Amazon reported security vulnerabilities to the White House. The administration imposed export controls on Anthropic's Mythos-class models just days after the June 9 release.
Moonshot AI releases Kimi K2.7 Code with 1T parameters, 256K context window, 30% lower thinking token usage
Moonshot AI has released Kimi K2.7 Code, a 1 trillion parameter Mixture-of-Experts model designed for long-horizon coding tasks. The model features a 256K context window and reduces thinking token usage by approximately 30% compared to its predecessor K2.6.
Comments
Loading...