model release

IBM releases Granite 4.1-8B with 131K context window and enhanced tool-calling capabilities

TL;DR

IBM has released Granite 4.1-8B, an 8-billion parameter long-context model with a 131,072-token context window. The model achieves 85.37% on HumanEval and 73.84% on MMLU 5-shot, with enhanced tool-calling capabilities reaching 68.27% on BFCL v3. Released under Apache 2.0 license, it supports 12 languages.


IBM has released Granite 4.1-8B, an 8-billion parameter instruction-following model with a 131,072-token context window. The model was released on April 29, 2025, under an Apache 2.0 license.

Performance benchmarks

According to IBM, Granite 4.1-8B achieves the following scores:

  • Code tasks: 85.37% on HumanEval pass@1, 87.30% on MBPP pass@1, 79.88% on HumanEval+ pass@1
  • General tasks: 73.84% on MMLU 5-shot, 80.51% on BBH 3-shot with chain-of-thought
  • Math tasks: 92.49% on GSM8K 8-shot, 80.10% on Minerva Math 0-shot with CoT
  • Tool-calling: 68.27% on BFCL v3
  • Alignment: 87.06% on IFEval average

The model is part of a three-model family including 3B and 30B parameter versions.

Technical specifications

Granite 4.1-8B uses a decoder-only dense transformer architecture with:

  • Embedding size: 4,096
  • Layers: 40
  • Attention heads: 32, with 8 key-value heads
  • Grouped Query Attention (GQA)
  • RoPE positional embeddings
  • SwiGLU activation in MLP layers
  • MLP hidden size: 12,800
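The published numbers imply the usual derived quantities; a quick sanity check in plain arithmetic (no new facts beyond the spec list above):

```python
# Derived quantities from the published Granite 4.1-8B architecture spec.
EMBED_DIM = 4096      # embedding size
NUM_LAYERS = 40       # transformer layers
NUM_HEADS = 32        # query attention heads
NUM_KV_HEADS = 8      # key-value heads (GQA)
MLP_HIDDEN = 12800    # MLP hidden size

head_dim = EMBED_DIM // NUM_HEADS        # per-head dimension: 128
gqa_group = NUM_HEADS // NUM_KV_HEADS    # query heads sharing each KV head: 4

# With GQA, the KV cache stores 8 KV heads per layer instead of 32,
# a 4x reduction versus standard multi-head attention -- one reason
# a 131K-token context window stays tractable at 8B parameters.
print(head_dim, gqa_group)
```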

Training and capabilities

IBM trained the model on a combination of permissively licensed open-source instruction datasets and internally generated synthetic data. The post-training pipeline included supervised fine-tuning and reinforcement learning-based alignment.

The model supports 12 languages: English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. According to IBM, it can be fine-tuned for additional languages.

Key capabilities include:

  • Text summarization and classification
  • Question-answering and RAG
  • Code generation and completion
  • Function calling with OpenAI-compatible tool definitions
  • Fill-in-the-middle code completions
  • Multilingual dialog
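For the function-calling capability, the article notes OpenAI-compatible tool definitions. The sketch below shows what that pattern typically looks like with Transformers chat templates; the repo id `ibm-granite/granite-4.1-8b-instruct` and the exact template behavior are assumptions, not details from the release (check the model card).

```python
# Hypothetical tool in the OpenAI-compatible JSON-schema format the
# model reportedly accepts. The tool name and schema are illustrative.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

messages = [{"role": "user", "content": "What is the weather in Prague?"}]

def render_tool_prompt(messages: list, tools: list) -> str:
    """Render a tool-calling prompt via the tokenizer's chat template.

    Transformers is imported lazily so the tool schema above can be
    inspected without downloading the model. The repo id is an
    assumption -- confirm it on the Hugging Face model card.
    """
    from transformers import AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-4.1-8b-instruct")
    return tokenizer.apply_chat_template(
        messages, tools=tools, add_generation_prompt=True, tokenize=False
    )
```

The model's reply would then contain a structured tool call to parse and execute, with the result appended as a `tool`-role message for a follow-up generation.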

The model achieves 64.84% on MMMLU 5-shot across 11 languages and 58.89% on INCLUDE 5-shot across 14 languages.

Safety benchmarks

IBM reports safety scores of 95.80% on SALAD-Bench and 81.19% on AttaQ for the 8B model.

Availability

The model is available on Hugging Face under the Apache 2.0 license. Pricing for API access has not been disclosed. IBM provides code examples for both basic text generation and tool-calling use cases using the Transformers library.
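The basic text-generation path follows the standard Transformers pattern; this is a sketch of that pattern, not IBM's published example, and the repo id is again an assumption.

```python
# Minimal text-generation sketch for Granite 4.1-8B via Transformers.
# The repo id "ibm-granite/granite-4.1-8b-instruct" is an assumption --
# check the Hugging Face model card for the exact name.
MODEL_ID = "ibm-granite/granite-4.1-8b-instruct"

def build_messages(user_prompt: str) -> list:
    """Chat-format messages accepted by the tokenizer's chat template."""
    return [{"role": "user", "content": user_prompt}]

def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model (heavy; imported lazily) and generate a reply."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```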

What this means

Granite 4.1-8B represents IBM's push into the competitive 8B parameter model space with strong code performance and a notably large 131K context window. The Apache 2.0 license and multilingual support position it as an alternative to models like Llama 3.1 8B and Mistral 7B for enterprises requiring permissive licensing. The tool-calling improvements and comprehensive benchmark suite suggest IBM is targeting production AI assistant deployments, though actual inference costs and API availability remain unclear.

Related Articles

model release

IBM's Granite 4.1: 8B Dense Model Matches 32B MoE Performance on 15T Tokens

IBM released Granite 4.1, a family of dense decoder-only LLMs (3B, 8B, 30B parameters) trained on approximately 15 trillion tokens using a five-phase pre-training pipeline. The 8B instruct model matches or surpasses the previous Granite 4.0-H-Small (32B-A9B MoE) despite using fewer parameters and a simpler dense architecture. All models support up to 512K context windows and are released under Apache 2.0 license.

model release

Poolside releases Laguna XS.2: 33B parameter MoE coding model with 131K context window

Poolside has released Laguna XS.2, a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token, designed for agentic coding. The model features a 131,072-token context window, scores 68.2% on SWE-bench Verified, and is available under Apache 2.0 license with free API access.

product update

IBM releases Bob AI coding assistant after testing on 80,000 employees, claims 45% productivity gains

IBM has launched Bob, its AI coding assistant, following internal testing with 80,000 employees. The company claims teams saw average productivity gains of 45% across complex workflows. Pricing ranges from $20 to $200 per month using a "Bobcoin" credit system.

model release

Poolside Launches Laguna M.1, Free-Tier Coding Agent Model with 128K Context Window

Poolside has released Laguna M.1, its flagship coding agent model available for free on OpenRouter. The model features a 128K context window, up to 8K output tokens, and is optimized for agentic coding workflows with tool calling and reasoning capabilities.
