model releaseIbm

IBM releases Granite 4.1-8B with 131K context window and enhanced tool-calling capabilities

TL;DR

IBM has released Granite 4.1-8B, an 8-billion parameter long-context model with a 131,072-token context window. The model achieves 85.37% on HumanEval and 73.84% on MMLU 5-shot, with enhanced tool-calling capabilities reaching 68.27% on BFCL v3. Released under Apache 2.0 license, it supports 12 languages.

April 30, 2026 · 3:51 PM2 min read

Granite 4.1-8B — Quick Specs

Context window131K tokens

Compare Granite 4.1-8B with other models →

IBM releases Granite 4.1-8B with 131K context window and enhanced tool-calling capabilities

IBM has released Granite 4.1-8B, an 8-billion parameter instruction-following model with a 131,072-token context window. The model was released on April 29, 2025, under an Apache 2.0 license.

Performance benchmarks

According to IBM, Granite 4.1-8B achieves the following scores:

Code tasks: 85.37% on HumanEval pass@1, 87.30% on MBPP pass@1, 79.88% on HumanEval+ pass@1
General tasks: 73.84% on MMLU 5-shot, 80.51% on BBH 3-shot with chain-of-thought
Math tasks: 92.49% on GSM8K 8-shot, 80.10% on Minerva Math 0-shot with CoT
Tool-calling: 68.27% on BFCL v3
Alignment: 87.06% on IFEval average

The model is part of a three-model family including 3B and 30B parameter versions.

Technical specifications

Granite 4.1-8B uses a decoder-only dense transformer architecture with:

4,096 embedding size
40 layers
32 attention heads with 8 key-value heads
Grouped Query Attention (GQA)
RoPE positional embeddings
SwiGLU activation in MLP layers
12,800 MLP hidden size

Training and capabilities

IBM trained the model on a combination of open source instruction datasets with permissive licenses and internally generated synthetic data. The post-training pipeline included supervised fine-tuning and reinforcement learning alignment.

The model supports 12 languages: English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. According to IBM, it can be fine-tuned for additional languages.

Key capabilities include:

Text summarization and classification
Question-answering and RAG
Code generation and completion
Function calling with OpenAI-compatible tool definitions
Fill-in-the-middle code completions
Multilingual dialog

The model achieves 64.84% on MMMLU 5-shot across 11 languages and 58.89% on INCLUDE 5-shot across 14 languages.

Safety benchmarks

IBM reports safety scores of 95.80% on SALAD-Bench and 81.19% on AttaQ for the 8B model.

Availability

The model is available on Hugging Face under the Apache 2.0 license. Pricing for API access has not been disclosed. IBM provides code examples for both basic text generation and tool-calling use cases using the Transformers library.

What this means

Granite 4.1-8B represents IBM's push into the competitive 8B parameter model space with strong code performance and a notably large 131K context window. The Apache 2.0 license and multilingual support position it as an alternative to models like Llama 3.1 8B and Mistral 7B for enterprises requiring permissive licensing. The tool-calling improvements and comprehensive benchmark suite suggest IBM is targeting production AI assistant deployments, though actual inference costs and API availability remain unclear.

Source: huggingface.co ↗

IBM Granite open-source code-generation multilingual tool-calling Apache-2.0

model releaseApril 29, 2026

IBM's Granite 4.1: 8B Dense Model Matches 32B MoE Performance on 15T Tokens

IBM released Granite 4.1, a family of dense decoder-only LLMs (3B, 8B, 30B parameters) trained on approximately 15 trillion tokens using a five-phase pre-training pipeline. The 8B instruct model matches or surpasses the previous Granite 4.0-H-Small (32B-A9B MoE) despite using fewer parameters and a simpler dense architecture. All models support up to 512K context windows and are released under Apache 2.0 license.

model releaseApril 28, 2026

Poolside releases Laguna XS.2: 33B parameter MoE coding model with 131K context window

Poolside has released Laguna XS.2, a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token, designed for agentic coding. The model features a 131,072-token context window, scores 68.2% on SWE-bench Verified, and is available under Apache 2.0 license with free API access.

product updateApril 28, 2026

IBM releases Bob AI coding assistant after testing on 80,000 employees, claims 45% productivity gains

IBM has launched Bob, its AI coding assistant, following internal testing with 80,000 employees. The company claims teams saw average productivity gains of 45% across complex workflows. Pricing ranges from $20 to $200 per month using a "Bobcoin" credit system.

model releaseApril 28, 2026

Poolside Launches Laguna M.1, Free-Tier Coding Agent Model with 128K Context Window

Poolside has released Laguna M.1, its flagship coding agent model available for free on OpenRouter. The model features a 128K context window, up to 8K output tokens, and is optimized for agentic coding workflows with tool calling and reasoning capabilities.

IBM releases Granite 4.1-8B with 131K context window and enhanced tool-calling capabilities

Granite 4.1-8B — Quick Specs

IBM releases Granite 4.1-8B with 131K context window and enhanced tool-calling capabilities

Performance benchmarks

Technical specifications

Training and capabilities

Safety benchmarks

Availability

What this means

Related Articles

IBM's Granite 4.1: 8B Dense Model Matches 32B MoE Performance on 15T Tokens

Poolside releases Laguna XS.2: 33B parameter MoE coding model with 131K context window

IBM releases Bob AI coding assistant after testing on 80,000 employees, claims 45% productivity gains

Poolside Launches Laguna M.1, Free-Tier Coding Agent Model with 128K Context Window

Comments