IBM releases Apache 2.0 Granite 4.1 LLMs in 3B, 8B, and 30B sizes
IBM has released the Granite 4.1 family of language models under the Apache 2.0 license. The models come in 3B, 8B, and 30B parameter sizes. Unsloth has released 21 GGUF quantized variants of the 3B model, ranging from 1.2GB to 6.34GB.
Model availability and quantization
Unsloth released 21 GGUF quantized variants of the 3B model on Hugging Face. The quantized files range from 1.2GB to 6.34GB in size, with the full collection totaling 51.3GB. GGUF is the file format used by llama.cpp and compatible runtimes. The quantized variants trade some numerical precision for lower memory use, which is what lets the models run on consumer hardware.
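To put the file sizes above in perspective, they can be converted into a rough average number of bits stored per parameter. The following is a minimal sketch, not Unsloth's or IBM's method. It assumes that "GB" means 10^9 bytes, that the model has exactly the nominal 3 billion parameters, and that file metadata is negligible; the function name `bits_per_weight` is made up for illustration.

```python
def bits_per_weight(file_size_gb: float, n_params_billion: float) -> float:
    """Approximate average bits stored per parameter.

    Assumes 1 GB = 1e9 bytes and ignores GGUF metadata overhead.
    """
    total_bits = file_size_gb * 1e9 * 8
    return total_bits / (n_params_billion * 1e9)


# File sizes are taken from the article; 3.0 is the nominal parameter
# count in billions (an assumption, the exact count may differ).
for label, size_gb in [("smallest", 1.2), ("largest", 6.34)]:
    bpw = bits_per_weight(size_gb, 3.0)
    print(f"{label}: {size_gb} GB -> ~{bpw:.1f} bits/weight")
```

Under these assumptions the smallest file works out to roughly 3.2 bits per weight and the largest to roughly 16.9. The estimate is only as good as the nominal parameter count, so treat it as an order-of-magnitude guide rather than a statement about which quantization scheme each file uses.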
Training details
Granite team member Yousaf Shah published a detailed description of the training process in "Granite 4.1 LLMs: How They're Built" on the Hugging Face blog. The post covers the technical architecture and training methodology used for the model family.
Model performance
An informal test of the 3B model's SVG generation across quantization levels showed inconsistent results. Prompting all 21 quantized variants with "Generate an SVG of a pelican riding a bicycle" revealed no clear correlation between quantized file size and output quality. All variants produced abstract shapes rather than recognizable images, which suggests the model was not specifically trained for visual generation tasks.
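Anyone repeating a test like this across many variants may want an automatic first pass before looking at the images. The sketch below is one hypothetical way to do that using only the Python standard library: it pulls the first `<svg>` element out of a model's text response and checks that it parses as XML. The function names are invented for illustration, and this is not the procedure used in the test described above. It also says nothing about whether the drawing resembles a pelican; that still needs a human eye.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# Matches the first <svg ...> ... </svg> span, across line breaks.
SVG_PATTERN = re.compile(r"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)


def extract_svg(model_output: str) -> Optional[str]:
    """Return the first <svg>...</svg> span in the text, or None if absent."""
    match = SVG_PATTERN.search(model_output)
    return match.group(0) if match else None


def is_well_formed_svg(svg_text: str) -> bool:
    """True if the text parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False
    # Drop an XML namespace prefix such as {http://www.w3.org/2000/svg}.
    return root.tag.rsplit("}", 1)[-1].lower() == "svg"


# Hypothetical model response with chatter around the markup.
sample = (
    'Sure, here it is: <svg xmlns="http://www.w3.org/2000/svg" '
    'width="100" height="100"><circle cx="50" cy="50" r="40"/></svg> Enjoy!'
)
svg = extract_svg(sample)
print(svg is not None and is_well_formed_svg(svg))
```

A response that is cut off mid-element, or that closes tags in the wrong order, fails the check, which makes it a cheap filter for outputs that would not render at all.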
What this means
The Apache 2.0 license permits commercial use, modification, and redistribution, subject to its attribution and notice requirements, which positions Granite 4.1 as an alternative to models with more restrictive licenses. The 21 quantized variants give deployers a range of tradeoffs between file size and output fidelity. The poor SVG results suggest these models are focused on text processing rather than visual generation, even though they can emit SVG markup. Organizations evaluating Granite 4.1 should test it on their specific use cases rather than assume capabilities based on parameter count alone.