edge-ai

7 articles tagged with edge-ai

April 22, 2026
model release

Gemma 4 VLA runs locally on an 8GB NVIDIA Jetson Orin Nano Super with autonomous webcam tool-calling

NVIDIA engineer Asier Arranz demonstrated Gemma 4 running as a vision-language agent (VLA) on a Jetson Orin Nano Super with 8GB RAM. The model decides on its own when to access a webcam based on the user's query, with no hardcoded triggers. Speech-to-text, vision analysis, and text-to-speech all run entirely locally.

April 2, 2026
model release, Google DeepMind

Google DeepMind releases Gemma 4: open models ranking #3 and #6 on Arena AI leaderboard

Google DeepMind released Gemma 4, a family of four open models ranging from 2B to 31B parameters, all licensed under Apache 2.0. The 31B dense model ranks #3 on Arena AI's text leaderboard and the 26B mixture-of-experts variant ranks #6, outperforming closed models significantly larger in size.

March 24, 2026
model release, Stability AI

Stability AI releases Stable Audio Open Small for on-device audio generation with Arm

In partnership with Arm, Stability AI has open-sourced Stable Audio Open Small, a smaller and faster variant of its text-to-audio model designed for on-device deployment. The model maintains output quality and prompt adherence while reducing computational requirements for real-world edge deployment on devices powered by Arm's technology, which runs on 99% of smartphones globally.

March 23, 2026
product update

Multiverse Computing launches API portal for compressed AI models to reduce cloud dependence

Multiverse Computing, a Spanish startup, has launched a self-serve API portal giving developers direct access to compressed versions of models from OpenAI, Meta, DeepSeek, and Mistral AI. The move targets enterprises seeking to reduce cloud infrastructure dependence and lower compute costs through edge deployment. The company claims its HyperNova 60B 2602 model delivers faster responses at lower cost than the original OpenAI model it was derived from.

March 9, 2026
model release

IBM releases Granite 4.0 1B Speech: multilingual model for edge devices

IBM has released Granite 4.0 1B Speech, a 1-billion-parameter multilingual speech model designed for edge deployment. The model supports multiple languages and is optimized for devices with limited computational resources.

February 24, 2026
model release

Liquid AI releases LFM2-24B-A2B, a 24B parameter mixture-of-experts model

Liquid AI has released LFM2-24B-A2B, a 24-billion-parameter mixture-of-experts model designed for text generation and conversational tasks. The model supports nine languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish, and Portuguese.

February 20, 2026
product update

Taalas serves Llama 3.1 8B at 17,000 tokens/second with custom silicon

Taalas, a new Canadian hardware startup, announced its first product: a custom silicon implementation of Meta's Llama 3.1 8B model running at 17,000 tokens/second. The startup uses aggressive quantization combining 3-bit and 6-bit parameters. The system is accessible via chatjimmy.ai.