IBM Releases Granite 4.0 1B Speech Model for Edge Devices
IBM has released Granite 4.0 1B Speech, a 1 billion parameter multilingual speech recognition model designed for edge deployment. The model targets scenarios where computational resources are constrained and low-latency inference is critical.
Model Specifications
Granite 4.0 1B Speech contains 1 billion parameters and supports multiple languages, making it suitable for global applications. The model is optimized for edge devices, enabling on-device speech processing without reliance on cloud infrastructure.
Key Features
The model's compact size allows deployment on edge hardware with limited memory and compute capacity. IBM positions the release as part of its Granite model family, which includes text and multimodal variants.
The multilingual capability addresses a common limitation of speech models optimized solely for English. On-device processing, in turn, reduces latency and improves privacy, since audio is handled locally rather than transmitted to remote servers.
Distribution and Access
Granite 4.0 1B Speech is available through the Hugging Face Hub, making it accessible to the broader AI development community. IBM has not disclosed licensing terms or commercial-use restrictions.
Context
Compact speech models have become increasingly important as edge AI deployment grows. Unlike large language models, speech models face unique constraints: they must process streaming audio in real time while maintaining accuracy across multiple languages.
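The streaming constraint can be illustrated with a minimal sketch: incoming audio is buffered and emitted as fixed-size, overlapping windows, a typical front end for a streaming recognizer. The chunk and overlap sizes below are illustrative assumptions, not values taken from IBM's model documentation.

```python
def chunk_stream(samples, chunk_size=16000, overlap=4000):
    """Yield fixed-size, overlapping windows from a stream of audio samples.

    chunk_size=16000 is one second at a 16 kHz sample rate; both sizes are
    illustrative, not IBM's stated configuration.
    """
    step = chunk_size - overlap
    buffer = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) >= chunk_size:
            yield list(buffer[:chunk_size])
            del buffer[:step]  # keep the overlap for the next window
    if len(buffer) > overlap:  # flush trailing samples not yet emitted
        yield list(buffer)

# Example: 2.5 seconds of audio at 16 kHz -> overlapping 1 s windows
chunks = list(chunk_stream([0.0] * 40000))
```

Overlap between windows is a common trick to avoid cutting words at chunk boundaries; a real streaming pipeline would feed each window to the recognizer as it is produced.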
IBM's focus on the 1-billion-parameter scale reflects market demand for models that balance capability with deployability. Many edge applications cannot accommodate multi-billion-parameter models due to hardware limitations.
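The hardware constraint is easy to quantify: a model's raw weight footprint is roughly parameter count times bytes per parameter, before activations and runtime overhead. A back-of-the-envelope sketch (the precision options shown are generic, not IBM's stated formats):

```python
def weight_footprint_gb(n_params, bytes_per_param):
    """Approximate weight memory in GB (weights only; excludes
    activations and runtime overhead)."""
    return n_params * bytes_per_param / 1e9

N = 1_000_000_000  # 1B parameters
for name, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name}: {weight_footprint_gb(N, nbytes):.1f} GB")
```

At fp16 the weights alone need about 2 GB, and 4-bit quantization brings that near 0.5 GB, which is why the 1B scale can fit phone- and IoT-class hardware while multi-billion-parameter models often cannot.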
What This Means
Granite 4.0 1B Speech represents IBM's continued investment in edge AI infrastructure. For developers building voice applications for resource-constrained environments such as IoT devices, smartphones, and embedded systems, the multilingual support and compact footprint reduce the need for custom model training. The Hugging Face release signals IBM's intent to compete in the open-source speech model space, an area long dominated by academia-led projects. That said, the model's parity with existing speech models remains unverified: IBM has not published benchmarks.