IBM Releases Granite Embedding 311M R2 With 32K Context, 200+ Language Support
IBM released Granite Embedding 311M Multilingual R2, a 311-million parameter dense embedding model with 32,768-token context length and support for 200+ languages. The model scores 64.0 on Multilingual MTEB Retrieval (18 tasks), an 11.8-point improvement over its predecessor, and ships with ONNX and OpenVINO models for production deployment.
IBM released Granite Embedding 311M Multilingual R2, a 311-million parameter dense embedding model that produces 768-dimensional vectors with a context length of 32,768 tokens. The model supports 200+ languages based on its multilingual pretraining corpus, with enhanced support for 52 languages and 9 programming languages that receive explicit retrieval-pair and cross-lingual training.
Performance Numbers
Granite Embedding 311M R2 scores 64.0 on Multilingual MTEB Retrieval across 18 tasks—an 11.8-point improvement over the previous granite-embedding-278m-multilingual model (52.2). Across all retrieval benchmarks, the model averages 56.0, representing a 14.2-point gain over the prior generation.
The model is built on the ModernBERT architecture, replacing the XLM-RoBERTa base used in R1. This architectural shift brings alternating attention mechanisms, GeGLU activations, and rotary position embeddings.
Technical Specifications
Context window: 32,768 tokens (up from 512 in R1)
Parameters: 311 million
Embedding dimensions: 768 (full), with Matryoshka support for 512, 384, 256, or 128 dimensions
Vocabulary size: 262,000 tokens covering 200+ languages and code
Architecture: ModernBERT bi-encoder
License: Apache 2.0
Release date: April 29, 2026
The 52 languages with enhanced retrieval support include major European, Asian, and Middle Eastern languages (Albanian, Arabic, Bengali, Chinese, English, French, German, Hindi, Japanese, Korean, Russian, Spanish, and others). Code retrieval is supported for Python, Go, Java, JavaScript, PHP, Ruby, SQL, C, and C++.
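Because the model is a bi-encoder, queries and documents in any supported language are mapped into one shared vector space and compared by cosine similarity. The sketch below shows that retrieval mechanic with mock unit vectors standing in for model output; in real use, the vectors would come from encoding text with the model (the Hugging Face model id is assumed from the naming in this article).

```python
import numpy as np

rng = np.random.default_rng(0)

# Mock 768-dim embeddings standing in for real model output.
# In practice these would come from encoding queries/documents
# with granite-embedding-311m-multilingual-r2.
def mock_embed(n, dim=768):
    v = rng.standard_normal((n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)  # L2-normalize

doc_vecs = mock_embed(5)      # five "documents"
query_vec = mock_embed(1)[0]  # one "query"

# On unit vectors, cosine similarity reduces to a dot product.
scores = doc_vecs @ query_vec
ranking = np.argsort(-scores)  # best match first
print(ranking[0], scores[ranking[0]])
```

The ranking step is the same regardless of language: a Hindi query and a German document land in the same 768-dimensional space, so no translation layer is needed at retrieval time.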
Training and Deployment
IBM developed the model using knowledge distillation from multiple teacher models, contrastive fine-tuning, and model merging. According to IBM, all training data uses permissive, enterprise-friendly licenses, including IBM-collected and IBM-generated datasets.
The model ships with ONNX and OpenVINO export formats and is compatible with vLLM and llama.cpp (GGUF). IBM also released a smaller 97-million parameter variant (granite-embedding-97m-multilingual-r2) with 384-dimensional embeddings for latency-sensitive deployments.
The extended 32K context window enables long-document and multi-passage retrieval tasks. Matryoshka dimension reduction allows developers to truncate embeddings from 768 down to 128 dimensions with graceful performance degradation, reducing vector storage and memory requirements.
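Mechanically, Matryoshka truncation means keeping only the leading components of each vector and re-normalizing. A minimal sketch with mock vectors (with a Matryoshka-trained model the leading dimensions carry most of the signal, so similarities are approximately preserved; random vectors here only demonstrate the mechanics, not the quality claim):

```python
import numpy as np

rng = np.random.default_rng(1)

def l2norm(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Two mock full-size (768-dim) embeddings.
full = l2norm(rng.standard_normal((2, 768)))

def matryoshka_truncate(vecs, dim):
    # Keep the leading `dim` components, then re-normalize so cosine
    # similarity is still a plain dot product.
    return l2norm(vecs[..., :dim])

for dim in (512, 384, 256, 128):  # truncation sizes listed in the specs
    small = matryoshka_truncate(full, dim)
    sim = float(small[0] @ small[1])
    print(dim, round(sim, 3))
```

A 128-dim index needs one sixth of the storage of a 768-dim index, which is the tradeoff the article refers to.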
Model Family
The Granite Embedding R2 release includes four models:
- granite-embedding-311m-multilingual-r2 (768-dim, 200+ languages)
- granite-embedding-97m-multilingual-r2 (384-dim, 200+ languages)
- granite-embedding-english-r2 (English-optimized)
- granite-embedding-small-english-r2 (English-optimized, smaller)
IBM plans to publish a research paper in May 2026. The model is available on Hugging Face and works with the Sentence Transformers library and Hugging Face Transformers.
What This Means
The 32K context window positions Granite Embedding 311M R2 for long-document retrieval tasks that previously required chunking strategies. The 11.8-point MTEB improvement suggests meaningful gains in multilingual retrieval quality, though independent verification of benchmark scores is pending. IBM's focus on permissive licensing and enterprise-grade deployment formats (ONNX, OpenVINO) targets production use cases where model provenance matters. The Matryoshka support provides a practical tradeoff between embedding quality and infrastructure cost.
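The tradeoffs above can be put in back-of-envelope numbers (illustrative arithmetic with an assumed corpus size, not vendor figures): how many 512-token chunks a 32K-token document used to require, and how index size scales with embedding dimension at float32 precision.

```python
# Illustrative arithmetic for the tradeoffs above (float32 = 4 bytes/dim).
def index_size_gb(num_vectors, dim, bytes_per_dim=4):
    return num_vectors * dim * bytes_per_dim / 1e9

docs = 10_000_000  # hypothetical corpus size

# With the old 512-token window, a 32,768-token document needs 64 chunks;
# with the R2 32K window it can be embedded in a single pass.
chunks_per_doc_512 = 32_768 // 512
print(chunks_per_doc_512)  # → 64

for dim in (768, 128):
    print(dim, round(index_size_gb(docs, dim), 2))
# → 768 30.72  (GB for 10M full-size vectors)
# → 128 5.12   (GB after Matryoshka truncation)
```

The 6x storage reduction from truncation compounds with the 64x reduction in vector count from not chunking, which is why both features target the same infrastructure-cost problem.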
Related Articles
IBM Releases Granite 4.1 30B With 131K Context Window and Enhanced Tool-Calling
IBM released Granite 4.1 30B, a 30-billion parameter instruction-following model with a 131,072 token context window. The model scores 80.16 on MMLU 5-shot and 88.41 on HumanEval pass@1, with enhanced tool-calling capabilities following OpenAI's function definition schema.
IBM Releases Granite 4.1 8B with 131K Context Window at $0.05/M Input Tokens
IBM has released Granite 4.1 8B, an 8-billion-parameter decoder-only language model with a 131,072-token context window. The model supports 12 languages and costs $0.05 per million input tokens and $0.10 per million output tokens, available under the Apache 2.0 license.
IBM releases Granite 4.1-8B with 131K context window and enhanced tool-calling capabilities
IBM has released Granite 4.1-8B, an 8-billion parameter long-context model with a 131,072-token context window. The model achieves 85.37% on HumanEval and 73.84% on MMLU 5-shot, with enhanced tool-calling capabilities reaching 68.27% on BFCL v3. Released under Apache 2.0 license, it supports 12 languages.
IBM Releases Granite Speech 4.1 2B: 2-Billion-Parameter Multilingual Speech Model with Non-Autoregressive Variant
IBM has released Granite Speech 4.1 2B, a 2-billion-parameter speech-language model trained on 174,000 hours of audio for automatic speech recognition and translation across English, French, German, Spanish, Portuguese, and Japanese. The model introduces a dual-head CTC encoder and includes variants for speaker attribution and a novel non-autoregressive architecture for higher throughput.