model release

Reka releases Reka Edge, a 7B multimodal model for efficient image and video understanding

TL;DR

Reka has released Reka Edge, a 7-billion parameter multimodal model designed for efficient image and video understanding. The model features a 16,384 token context window and is priced at $0.20 per million input and output tokens.


Reka Edge — Quick Specs

Context window: 16K tokens
Input: $0.20/1M tokens
Output: $0.20/1M tokens

Reka has released Reka Edge, a 7-billion parameter multimodal vision-language model optimized for efficient performance on image understanding, video analysis, object detection, and agentic tool-use tasks.

Model Specifications

Reka Edge is a multimodal model that accepts image, video, and text inputs while generating text outputs. The model features:

  • Parameter count: 7 billion
  • Context window: 16,384 tokens
  • Input pricing: $0.20 per million tokens
  • Output pricing: $0.20 per million tokens
  • Release date: March 20, 2026

Capabilities

According to the model documentation, Reka Edge is optimized for:

  • Image understanding and analysis
  • Video analysis and processing
  • Object detection
  • Agentic tool-use applications

The model is designed as an efficiency-focused alternative, with its 7B parameter count positioning it as a lightweight option compared to larger multimodal models.

Availability

Reka Edge is available through OpenRouter, which routes requests across multiple providers and handles normalization of requests and responses. The platform provides API access with support for various SDKs and frameworks.
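Since OpenRouter exposes an OpenAI-compatible chat completions endpoint, a multimodal request to the model can be sketched as a JSON payload mixing text and image parts. This is a minimal sketch, not confirmed usage from the announcement: the model slug `reka/reka-edge` and the example image URL are assumptions, so check OpenRouter's model list for the exact identifier.

```python
import json

# Hypothetical payload for OpenRouter's OpenAI-compatible endpoint
# (POST https://openrouter.ai/api/v1/chat/completions).
# The slug "reka/reka-edge" is an assumption, not a confirmed identifier.
payload = {
    "model": "reka/reka-edge",
    "messages": [
        {
            "role": "user",
            "content": [
                # Text part: the instruction for the model
                {"type": "text", "text": "Describe the objects in this image."},
                # Image part: a URL the provider fetches (example placeholder)
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
    "max_tokens": 512,
}

# Sending it requires an API key, e.g. with the requests library:
#   requests.post(
#       "https://openrouter.ai/api/v1/chat/completions",
#       headers={"Authorization": f"Bearer {OPENROUTER_API_KEY}"},
#       json=payload,
#   )
print(json.dumps(payload, indent=2))
```

The same payload shape should work across the providers OpenRouter routes to, since the platform normalizes requests and responses.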

Because the model charges the same $0.20 per million tokens for both input and output, cost calculation for deployment scenarios reduces to a single flat rate on total token volume.
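With identical input and output rates, a request's cost is simply total tokens times the flat rate. A minimal sketch of that arithmetic, using the published $0.20/1M price and the 16,384-token context window:

```python
PRICE_PER_MILLION = 0.20  # flat rate in dollars, same for input and output

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in dollars at Reka Edge's flat token rate."""
    return (input_tokens + output_tokens) * PRICE_PER_MILLION / 1_000_000

# A request filling the full 16,384-token context window, plus an
# equally long response, costs well under a cent:
cost = estimate_cost(16_384, 16_384)
print(f"${cost:.6f}")  # → $0.006554
```

With asymmetric pricing (such as GPT-4o mini's $0.15 input / $0.60 output split), the estimate would instead need separate rates per direction; the flat rate removes that bookkeeping.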

What this means

Reka Edge represents a focus on efficient multimodal inference, targeting use cases where model size and inference cost matter as much as capability. With a 7B parameter count and $0.20/M pricing, it positions itself as accessible for developers building vision-language applications with moderate compute constraints. The model's emphasis on video analysis and object detection suggests it's engineered for specific computer vision tasks rather than general-purpose multimodal reasoning.

