product update

Multiverse Computing launches API portal for compressed AI models to reduce cloud dependence

TL;DR

Multiverse Computing, a Spanish startup, has launched a self-serve API portal giving developers direct access to compressed versions of models from OpenAI, Meta, DeepSeek, and Mistral AI. The move targets enterprises seeking to reduce cloud infrastructure dependence and lower compute costs through edge deployment. The company claims its HyperNova 60B 2602 model delivers faster responses at lower cost than the original OpenAI model it was derived from.


Multiverse Computing, a Spanish AI startup, is pushing compressed models into mainstream enterprise use with the launch of a self-serve API portal that gives developers direct access to smaller, optimized versions of models from OpenAI, Meta, DeepSeek, and Mistral AI.

The move addresses a growing concern in the AI industry: dependence on external compute infrastructure. With private-company default rates reaching 9.2%, the highest in years, VC firm Lux Capital recently warned companies to put cloud compute commitments in writing rather than rely on handshake agreements. Multiverse's approach sidesteps this risk by enabling deployment directly on user devices and enterprise infrastructure.

CompactifAI App and Local Deployment

Multiverse simultaneously launched CompactifAI, a ChatGPT-like chat application that showcases the capabilities of its compressed models. The app embeds Gilda, a model small enough to run locally and offline on compatible devices. The system has a practical limitation, however: older iPhone models lack the RAM and storage to run the model, so the app routes those requests to cloud-hosted models through a system named Ash Nazg. Once a request goes to the cloud, the privacy advantage of local processing disappears.
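The local-versus-cloud decision described above amounts to a capability check on the device. A minimal sketch follows; the 8 GB threshold, function name, and backend labels are illustrative assumptions, not Multiverse's actual logic:

```python
# Illustrative sketch of a local-vs-cloud routing decision.
# The threshold and backend labels are assumptions, not CompactifAI's real logic.
MIN_LOCAL_RAM = 8 * 1024**3  # assumed minimum RAM for on-device inference

def pick_backend(total_ram_bytes: int, min_ram_bytes: int = MIN_LOCAL_RAM) -> str:
    """Run the embedded model on-device when the hardware allows it;
    otherwise fall back to a cloud API, forfeiting the privacy benefit
    of purely local processing."""
    if total_ram_bytes >= min_ram_bytes:
        return "local"  # offline, on-device inference
    return "cloud"      # request routed to a hosted model via API
```

The design trade-off is visible in the fallback branch: routing to the cloud keeps the app functional on older hardware, but the request leaves the device.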

The app currently has fewer than 5,000 downloads per month, suggesting it serves as a proof-of-concept rather than a primary revenue driver. The real target is enterprises.

Enterprise API Portal and Cost Reduction

The self-serve API portal, launching today, eliminates the need for AWS Marketplace intermediaries and provides real-time usage monitoring — a critical feature for cost-conscious enterprises. CEO Enrique Lizaso stated the portal "gives developers direct access to compressed models with the transparency and control needed to run them in production."
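For developers, "direct access" to a hosted compressed model typically means an HTTP API resembling the common chat-completions pattern. The sketch below is a hypothetical illustration using only the standard library; the base URL, model name, auth scheme, and request schema are assumptions, since Multiverse has not published its API specification here:

```python
import json
from urllib import request

# Hypothetical values for illustration only; the portal's real base URL,
# model identifiers, and authentication scheme may differ.
API_BASE = "https://api.example.com/v1"
API_KEY = "YOUR_API_KEY"

def build_chat_request(model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat-completions HTTP request for a
    compressed model served behind a self-serve API portal."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A client would send the request with `urllib.request.urlopen` (or any HTTP library) and read per-request usage from the response, which is where the portal's real-time usage monitoring would surface.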

The primary draw remains clear: lower compute costs. Smaller models also offer advantages for specific use cases, particularly agentic coding workflows where AI autonomously completes multistep programming tasks.

Model Performance Claims

Multiverse's latest compressed model, HyperNova 60B 2602, is built on gpt-oss-120b, an OpenAI model released with open weights. According to the company, HyperNova 60B 2602 delivers faster responses at lower cost than the original, though independent benchmarks confirming this claim are not yet available.

This positioning aligns with recent trends in the smaller-model space. Mistral AI this week launched Mistral Small 4, claiming simultaneous optimization for general chat, coding, agentic tasks, and reasoning. The narrowing gap between small and large models is driving enterprise adoption.

Customer Base and Funding

Multiverse already serves more than 100 global customers, including the Bank of Canada, Bosch, and Iberdrola. After raising $215 million in Series B funding last year, the company is reportedly raising €500 million (about $540 million) at a valuation exceeding €1.5 billion (about $1.6 billion).

The use cases justifying this valuation extend beyond cost optimization: embedded AI in drones, satellites, and connectivity-constrained environments represents a significant market opportunity that requires true edge deployment rather than cloud fallback.

What This Means

Multiverse's API launch signals that compressed models are moving from research artifacts to production-grade tools. For enterprises, the appeal is clear: lower costs, reduced cloud dependence, and privacy benefits for sensitive workloads. The company's existing customer roster and multi-billion-dollar valuation suggest the market is taking this category seriously. However, Multiverse's performance claims have yet to be independently verified, and that gap remains the key uncertainty.

