Perceptron Launches Mk1 Vision-Language Model with Video Reasoning at $0.15/$1.50 per 1M Tokens
Perceptron has released Perceptron Mk1, a vision-language model designed for video understanding and embodied reasoning tasks. The model accepts image and video inputs with 33K context window, priced at $0.15 per 1M input tokens and $1.50 per 1M output tokens, and supports structured spatial annotations on demand.
Perceptron Mk1 — Quick Specs
Perceptron Launches Mk1 Vision-Language Model with Video Reasoning
Perceptron has released Perceptron Mk1 (Mark One), a multimodal vision-language model built for video understanding and embodied reasoning tasks. The model processes image and video inputs paired with natural language queries, returning either structured annotations or natural language responses.
Pricing and Context
Perceptron Mk1 is priced at $0.15 per 1M input tokens and $1.50 per 1M output tokens, with a 33K token context window. The model is available through OpenRouter's API routing service.
Core Capabilities
According to Perceptron, Mk1 excels at multiple video understanding tasks including video question answering, summarization, and event detection. For image inputs, the model handles:
- Point-by-example grounding from multimodal prompts
- OCR and document parsing on real-world inputs
- Open vocabulary object detection and counting
- Hand pose estimation
Structured Annotation System
The model's distinctive feature is its optional structured annotation output. By default, Mk1 returns natural language text only. Users can request spatial localization through the annotation_format parameter:
- "point" for point annotations on images
- "box" for bounding boxes
- "polygon" for polygon masks
- "clip" for temporal segments (start/end timestamps) in video
Annotations are emitted inline with text only when explicitly requested.
Optional Reasoning Mode
Mk1 includes an optional reasoning mode that can be enabled per request. This trades increased latency for deeper analysis on complex tasks, allowing the model to show step-by-step thinking processes. OpenRouter provides access to the reasoning_details array in API responses.
What This Means
Perceptron Mk1 enters a crowded multimodal model market with a focus on structured output formats and video understanding. The $1.50 per 1M output tokens places it in the premium tier—comparable to GPT-4 Vision pricing. The optional reasoning mode and granular annotation controls suggest the model targets developers building computer vision pipelines and video analysis applications rather than general-purpose chat interfaces. The company has not disclosed benchmark scores or parameter count, making direct performance comparisons difficult.
Related Articles
Mistral OCR 4 Launches With Bounding Boxes, 170 Language Support at $2-4 Per 1,000 Pages
Mistral AI released OCR 4, a compact document extraction model that returns bounding boxes, block classification, and inline confidence scores alongside text. The model supports 170 languages, scores 85.20 on OlmOCRBench, and is priced at $4 per 1,000 pages via API ($2 with batch discount) or $5 per 1,000 pages through Document AI.
US government allows Anthropic to release Claude Mythos 5 to 100+ institutions after two-week export control block
The US Commerce Department has partially lifted export controls on Anthropic's Claude Mythos 5 model, permitting its release to over 100 US institutions including major companies and government agencies. The restrictions, imposed two weeks ago alongside a block on Claude Fable 5, reportedly stemmed from concerns about potential jailbreaks and Chinese access.
Trump administration approves Anthropic's Mythos 5 release to 100 companies and federal agencies
The U.S. government approved Anthropic's release of its Mythos 5 model to roughly 100 companies and federal agencies on Friday. The limited distribution marks a controlled rollout requiring government clearance.
OpenAI restricts GPT-5.6 rollout to government-approved partners, calls arrangement unsustainable
OpenAI released its GPT-5.6 model lineup to a limited group of "trusted partners" after the U.S. government requested restrictions on the rollout. The company released three models—Sol ($5/$30 per million tokens), Terra ($2.50/$15), and Luna ($1/$6)—but said the government-mandated preview "shouldn't become the long-term default."
Comments
Loading...