Reka releases Reka Edge, a 7B multimodal model for efficient image and video understanding
Reka has released Reka Edge, a 7-billion-parameter multimodal model designed for efficient image and video understanding. The model features a 16,384-token context window and is priced at $0.20 per million tokens for both input and output.
Model Specifications
Reka Edge is a multimodal model that accepts image, video, and text inputs and generates text output. The model features:
- Parameter count: 7 billion
- Context window: 16,384 tokens
- Input pricing: $0.20 per million tokens
- Output pricing: $0.20 per million tokens
- Release date: March 20, 2026
Capabilities
According to the model documentation, Reka Edge is optimized for:
- Image understanding and analysis
- Video analysis and processing
- Object detection
- Agentic tool-use applications (a request-format sketch follows this list)
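The documentation names agentic tool use as a target task but does not publish a request format. A minimal sketch, assuming the OpenAI-compatible function-calling schema that OpenRouter (see Availability below) forwards to models; the reka/reka-edge slug and the get_weather tool are illustrative assumptions, not confirmed identifiers:

```python
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint; supply your own API key.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

# Hypothetical tool definition in the standard function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not part of any Reka API
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="reka/reka-edge",  # assumed slug; check OpenRouter's model list
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools,
)

# If the model elects to call the tool, the call arrives as structured JSON.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```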
The model is designed as an efficiency-focused alternative to larger multimodal models, with its 7B parameter count keeping inference lightweight.
Availability
Reka Edge is available through OpenRouter, which routes requests across multiple providers and handles normalization of requests and responses. The platform provides API access with support for various SDKs and frameworks.
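As a concrete illustration, here is a minimal image-understanding request through that endpoint using the OpenAI SDK pointed at OpenRouter; the reka/reka-edge slug is an assumption and the image URL is a placeholder:

```python
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

response = client.chat.completions.create(
    model="reka/reka-edge",  # assumed slug; confirm against OpenRouter's catalog
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)

print(response.choices[0].message.content)
```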
The model's pricing structure—identical at $0.20 per million tokens for both input and output—simplifies cost calculation for deployment scenarios.
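Because both directions bill at the same rate, cost depends only on the total token count. A minimal sketch of the arithmetic, using the rates listed above (the example token counts are illustrative):

```python
RATE_PER_MILLION = 0.20  # flat rate for both input and output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one Reka Edge request at the flat rate."""
    return (input_tokens + output_tokens) / 1_000_000 * RATE_PER_MILLION

# A near-full 16,384-token context plus a short reply:
print(f"${request_cost(15_000, 1_000):.4f}")  # -> $0.0032
```

At this rate, even a request that fills the entire 16,384-token context window costs well under a cent.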
What this means
Reka Edge represents a focus on efficient multimodal inference, targeting use cases where model size and inference cost matter as much as capability. With a 7B parameter count and $0.20/M pricing, it positions itself as accessible for developers building vision-language applications with moderate compute constraints. The model's emphasis on video analysis and object detection suggests it's engineered for specific computer vision tasks rather than general-purpose multimodal reasoning.