Reka releases Reka Edge, a 7B multimodal model for efficient image and video understanding
Reka has released Reka Edge, a 7-billion parameter multimodal model designed for efficient image and video understanding. The model features a 16,384-token context window and is priced at $0.20 per million tokens for both input and output.
Reka has released Reka Edge, a 7-billion parameter multimodal vision-language model optimized for efficient performance on image understanding, video analysis, object detection, and agentic tool-use tasks.
Model Specifications
Reka Edge is a multimodal model that accepts image, video, and text inputs and generates text output. The model features:
- Parameter count: 7 billion
- Context window: 16,384 tokens
- Input pricing: $0.20 per million tokens
- Output pricing: $0.20 per million tokens
- Release date: March 20, 2026
Capabilities
According to the model documentation, Reka Edge is optimized for:
- Image understanding and analysis
- Video analysis and processing
- Object detection
- Agentic tool-use applications
The model is designed as an efficiency-focused alternative, with its 7B parameter count positioning it as a lightweight option compared to larger multimodal models.
Availability
Reka Edge is available through OpenRouter, which routes requests across multiple providers and handles normalization of requests and responses. The platform provides API access with support for various SDKs and frameworks.
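Because OpenRouter exposes an OpenAI-compatible chat completions endpoint, a vision request can be sketched as below. The model slug `rekaai/reka-edge` is an assumption here, not confirmed by the announcement; check OpenRouter's model page for the exact identifier before use.

```python
import json

# Assumed model slug; confirm the exact identifier on OpenRouter's model page.
MODEL = "rekaai/reka-edge"

def build_vision_request(prompt: str, image_url: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload with one image."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# POST this payload as JSON to https://openrouter.ai/api/v1/chat/completions
# with an "Authorization: Bearer <OPENROUTER_API_KEY>" header.
payload = build_vision_request(
    "Describe this image in one sentence.",
    "https://example.com/photo.jpg",
)
print(json.dumps(payload, indent=2))
```

The same message shape extends to video inputs where the routed provider supports them; only the content parts change.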
The model's symmetric pricing, $0.20 per million tokens for both input and output, simplifies cost calculation for deployment scenarios: only the total token count matters, not the input/output split.
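With a flat rate on both sides, estimating a request's cost reduces to a single multiply over the total token count. A minimal sketch (the token counts in the example are illustrative, not measured):

```python
PRICE_PER_MILLION = 0.20  # USD per million tokens, same for input and output

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD; with symmetric pricing only the total token count matters."""
    return (input_tokens + output_tokens) * PRICE_PER_MILLION / 1_000_000

# e.g. a 12,000-token image prompt plus a 500-token answer:
print(f"${request_cost(12_000, 500):.6f}")  # → $0.002500
```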
What this means
Reka Edge represents a focus on efficient multimodal inference, targeting use cases where model size and inference cost matter as much as capability. With a 7B parameter count and $0.20/M pricing, it positions itself as accessible for developers building vision-language applications with moderate compute constraints. The model's emphasis on video analysis and object detection suggests it's engineered for specific computer vision tasks rather than general-purpose multimodal reasoning.