model release

Reka releases Reka Edge, a 7B multimodal model for efficient image and video understanding

TL;DR

Reka has released Reka Edge, a 7-billion parameter multimodal model designed for efficient image and video understanding. The model features a 16,384 token context window and is priced at $0.20 per million input and output tokens.

1 min read
0

Reka Edge — Quick Specs

Context window16K tokens
Input$0.2/1M tokens
Output$0.2/1M tokens

Reka has released Reka Edge, a 7-billion parameter multimodal vision-language model optimized for efficient performance on image understanding, video analysis, object detection, and agentic tool-use tasks.

Model Specifications

Reka Edge is a multimodal model that accepts image, video, and text inputs while generating text outputs. The model features:

  • Parameter count: 7 billion
  • Context window: 16,384 tokens
  • Input pricing: $0.20 per million tokens
  • Output pricing: $0.20 per million tokens
  • Release date: March 20, 2026

Capabilities

According to the model documentation, Reka Edge is optimized for:

  • Image understanding and analysis
  • Video analysis and processing
  • Object detection
  • Agentic tool-use applications

The model is designed as an efficiency-focused alternative, with its 7B parameter count positioning it as a lightweight option compared to larger multimodal models.

Availability

Reka Edge is available through OpenRouter, which routes requests across multiple providers and handles normalization of requests and responses. The platform provides API access with support for various SDKs and frameworks.

The model's pricing structure—identical at $0.20 per million tokens for both input and output—simplifies cost calculation for deployment scenarios.

What this means

Reka Edge represents a focus on efficient multimodal inference, targeting use cases where model size and inference cost matter as much as capability. With a 7B parameter count and $0.20/M pricing, it positions itself as accessible for developers building vision-language applications with moderate compute constraints. The model's emphasis on video analysis and object detection suggests it's engineered for specific computer vision tasks rather than general-purpose multimodal reasoning.

Related Articles

model release

Mistral Releases Mistral 3 Family: 675B-Parameter Large 3 MoE and Three Edge Models Under Apache 2.0

Mistral has released Mistral 3, including Mistral Large 3—a sparse mixture-of-experts model with 41B active and 675B total parameters—and three Ministral 3 edge models (3B, 8B, 14B). All models are released under Apache 2.0 license with multimodal capabilities and are available today on multiple platforms.

model release

Google releases Gemini 3.1 Flash Image, claims Pro-level quality at $0.50 per 1M tokens

Google has released Gemini 3.1 Flash Image, internally codenamed "Nano Banana 2," an image generation and editing model with a 131K context window. The model is priced at $0.50 per 1M input tokens and $3 per 1M output tokens.

model release

Mistral OCR 4 Launches With Bounding Boxes, 170 Language Support at $2-4 Per 1,000 Pages

Mistral AI released OCR 4, a compact document extraction model that returns bounding boxes, block classification, and inline confidence scores alongside text. The model supports 170 languages, scores 85.20 on OlmOCRBench, and is priced at $4 per 1,000 pages via API ($2 with batch discount) or $5 per 1,000 pages through Document AI.

model release

Mistral Releases Voxtral TTS: 4B Parameter Text-to-Speech Model at $0.016 per 1k Characters

Mistral AI has released Voxtral TTS, a 4B parameter text-to-speech model supporting 9 languages including English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. The model achieves 70ms latency for typical inputs and can clone voices from as little as 3 seconds of audio, priced at $0.016 per 1,000 characters.

Comments

Loading...