IBM Releases Granite 4.1 30B With 131K Context Window and Enhanced Tool-Calling
IBM released Granite 4.1 30B, a 30-billion-parameter instruction-following model with a 131,072-token context window. The model scores 80.16 on MMLU 5-shot and 88.41 on HumanEval pass@1, with enhanced tool-calling capabilities that follow OpenAI's function definition schema.
IBM released Granite 4.1 30B on April 29, 2026 under the Apache 2.0 license. The model is available on Hugging Face.
Benchmark Performance
The model scores 80.16 on MMLU 5-shot, 88.41 on HumanEval pass@1, and 85.45 on MBPP pass@1. On reasoning benchmarks, it achieves 83.74 on BBH 3-shot with chain-of-thought and 64.09 on MMLU-Pro 5-shot with CoT. The model scores 89.65 on IFEval (instruction following) and 71.02 on ArenaHard.
For math tasks, Granite 4.1 30B reaches 94.16 on GSM8K 8-shot and 81.93 on DeepMind Math 0-shot with CoT. Tool-calling capability scores 73.68 on BFCL v3.
Architecture and Training
The model uses a decoder-only dense transformer with grouped-query attention, RoPE positional embeddings, and SwiGLU activation. It has 64 layers, 32 attention heads (8 KV heads), 4,096 embedding size, and 32,768 MLP hidden size. Attention head size is 128.
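The reported dimensions can be checked with some quick arithmetic, which also gives a rough sense of what grouped-query attention saves at long context. The sketch below uses only the figures quoted above, plus one assumption that is not from the model card: that the key/value cache is stored in a 16-bit dtype (2 bytes per value). Actual memory use depends on the serving stack, the cache dtype, and any quantization.

```python
# Figures below are taken from the architecture description above.
NUM_LAYERS = 64
NUM_ATTENTION_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
EMBEDDING_SIZE = 4096
CONTEXT_TOKENS = 131_072

# Assumption (not from the model card): KV cache stored in a 16-bit dtype.
BYTES_PER_VALUE = 2


def kv_cache_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value):
    """Bytes needed to cache keys and values for one token across all layers."""
    # The leading factor of 2 covers one key tensor and one value tensor.
    return 2 * layers * kv_heads * head_dim * bytes_per_value


# Query heads times head size should reproduce the embedding size.
assert NUM_ATTENTION_HEADS * HEAD_DIM == EMBEDDING_SIZE

queries_per_kv_head = NUM_ATTENTION_HEADS // NUM_KV_HEADS
per_token = kv_cache_bytes_per_token(
    NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_VALUE
)
full_context_gib = per_token * CONTEXT_TOKENS / 2**30

print(f"query heads per KV head: {queries_per_kv_head}")
print(f"KV cache per token: {per_token // 1024} KiB")
print(f"KV cache at full context: {full_context_gib:.0f} GiB")
```

Under that 16-bit assumption, each token costs 256 KiB of cache and a single sequence at the full 131,072-token window needs about 32 GiB. With 8 KV heads instead of 32, that is a quarter of what the same layout would need if every query head had its own key/value head.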
IBM trained the model using "a combination of open source instruction datasets with permissive license and internally collected synthetic datasets," according to the model card. The post-training pipeline included supervised fine-tuning and reinforcement learning alignment.
Language and Safety
Granite 4.1 30B officially supports 12 languages: English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. It scores 73.71 on MMMLU 5-shot (11 languages) and 67.26 on INCLUDE 5-shot (14 languages).
On safety benchmarks, the model achieves 96.41 on SALAD-Bench, 85.76 on AttaQ, and 78.19 average on Tulu3 Safety Eval.
Tool-Calling Implementation
The model supports tool-calling using OpenAI's function definition schema. IBM's implementation wraps each function call in <tool_call> XML tags, with a JSON object inside that contains the function name and arguments. This lets the model integrate with external APIs and functions.
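To make the format concrete, here is a minimal sketch of how an application might define a tool and pull calls out of generated text. It is illustrative only: the `get_weather` tool, the `extract_tool_calls` helper, the sample output string, and the `name`/`arguments` JSON keys are assumptions for this example and are not taken from IBM's documentation. Check the model card and chat template for the exact format before relying on it.

```python
import json
import re

# Hypothetical tool definition in the OpenAI function-definition style.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Non-greedy match so several <tool_call> blocks in one reply stay separate.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return (name, arguments) pairs for each well-formed <tool_call> block."""
    calls = []
    for match in TOOL_CALL_PATTERN.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip blocks whose body is not valid JSON
        # Assumed key names: "name" and "arguments".
        if isinstance(payload, dict) and "name" in payload:
            calls.append((payload["name"], payload.get("arguments", {})))
    return calls


# Assumed example of model output; the real format may differ in detail.
sample_output = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Boston"}}\n'
    "</tool_call>"
)
parsed_calls = extract_tool_calls(sample_output)
print(parsed_calls)
```

In practice the tool definitions would be passed to the model through its chat template, and each parsed call would be dispatched to the matching function, with the result returned to the model in a follow-up turn.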
What This Means
Granite 4.1 30B gives teams building on open models an Apache 2.0-licensed alternative with competitive performance on instruction-following and code generation tasks. The 131K context window and multilingual support position it for enterprise RAG applications. However, pricing for hosted inference has not yet been disclosed, and the model trails frontier models on advanced reasoning benchmarks such as GPQA (45.76 vs. 50+ for leading models). The enhanced tool-calling capability and permissive license make it particularly relevant for commercial deployments that require function integration.