model release

Perceptron Launches Mk1 Vision-Language Model with Video Reasoning at $0.15/$1.50 per 1M Tokens

TL;DR

Perceptron has released Perceptron Mk1, a vision-language model designed for video understanding and embodied reasoning tasks. The model accepts image and video inputs with 33K context window, priced at $0.15 per 1M input tokens and $1.50 per 1M output tokens, and supports structured spatial annotations on demand.

2 min read
0

Perceptron Mk1 — Quick Specs

Context window33K tokens
Input$0.15/1M tokens
Output$1.5/1M tokens

Perceptron Launches Mk1 Vision-Language Model with Video Reasoning

Perceptron has released Perceptron Mk1 (Mark One), a multimodal vision-language model built for video understanding and embodied reasoning tasks. The model processes image and video inputs paired with natural language queries, returning either structured annotations or natural language responses.

Pricing and Context

Perceptron Mk1 is priced at $0.15 per 1M input tokens and $1.50 per 1M output tokens, with a 33K token context window. The model is available through OpenRouter's API routing service.

Core Capabilities

According to Perceptron, Mk1 excels at multiple video understanding tasks including video question answering, summarization, and event detection. For image inputs, the model handles:

  • Point-by-example grounding from multimodal prompts
  • OCR and document parsing on real-world inputs
  • Open vocabulary object detection and counting
  • Hand pose estimation

Structured Annotation System

The model's distinctive feature is its optional structured annotation output. By default, Mk1 returns natural language text only. Users can request spatial localization through the annotation_format parameter:

  • "point" for point annotations on images
  • "box" for bounding boxes
  • "polygon" for polygon masks
  • "clip" for temporal segments (start/end timestamps) in video

Annotations are emitted inline with text only when explicitly requested.

Optional Reasoning Mode

Mk1 includes an optional reasoning mode that can be enabled per request. This trades increased latency for deeper analysis on complex tasks, allowing the model to show step-by-step thinking processes. OpenRouter provides access to the reasoning_details array in API responses.

What This Means

Perceptron Mk1 enters a crowded multimodal model market with a focus on structured output formats and video understanding. The $1.50 per 1M output tokens places it in the premium tier—comparable to GPT-4 Vision pricing. The optional reasoning mode and granular annotation controls suggest the model targets developers building computer vision pipelines and video analysis applications rather than general-purpose chat interfaces. The company has not disclosed benchmark scores or parameter count, making direct performance comparisons difficult.

Related Articles

model release

Mistral OCR 4 Launches With Bounding Boxes, 170 Language Support at $2-4 Per 1,000 Pages

Mistral AI released OCR 4, a compact document extraction model that returns bounding boxes, block classification, and inline confidence scores alongside text. The model supports 170 languages, scores 85.20 on OlmOCRBench, and is priced at $4 per 1,000 pages via API ($2 with batch discount) or $5 per 1,000 pages through Document AI.

model release

US government allows Anthropic to release Claude Mythos 5 to 100+ institutions after two-week export control block

The US Commerce Department has partially lifted export controls on Anthropic's Claude Mythos 5 model, permitting its release to over 100 US institutions including major companies and government agencies. The restrictions, imposed two weeks ago alongside a block on Claude Fable 5, reportedly stemmed from concerns about potential jailbreaks and Chinese access.

model release

Trump administration approves Anthropic's Mythos 5 release to 100 companies and federal agencies

The U.S. government approved Anthropic's release of its Mythos 5 model to roughly 100 companies and federal agencies on Friday. The limited distribution marks a controlled rollout requiring government clearance.

model release

OpenAI restricts GPT-5.6 rollout to government-approved partners, calls arrangement unsustainable

OpenAI released its GPT-5.6 model lineup to a limited group of "trusted partners" after the U.S. government requested restrictions on the rollout. The company released three models—Sol ($5/$30 per million tokens), Terra ($2.50/$15), and Luna ($1/$6)—but said the government-mandated preview "shouldn't become the long-term default."

Comments

Loading...