
Meta launches Muse Spark, its first frontier model and first closed-weight AI system

TL;DR

Meta Superintelligence Labs has launched Muse Spark, a native multimodal reasoning model that scores 52 on the Artificial Analysis Intelligence Index, placing it among the top 5 frontier models. It is Meta's first frontier-class model and its first AI system without open weights, a strategic shift away from its open-source Llama strategy. In Artificial Analysis testing, the model used about as many output tokens as Gemini 3.1 Pro Preview, and Meta says it matches Llama 4 Maverick's capabilities with over an order of magnitude less compute.

April 8, 2026 · 6:05 PM · 4 min read

Meta Superintelligence Labs has unveiled Muse Spark, a native multimodal reasoning model that marks two firsts for the company: it is Meta's first frontier-class model and its first system without open weights.

Benchmark Performance

Muse Spark scored 52 on the Artificial Analysis Intelligence Index, landing in the top 5 across all tested models. Only Gemini 3.1 Pro Preview (top performer), GPT-5.4, and Claude Opus 4.6 scored higher. For context, Meta's previous models Llama 4 Maverick and Scout achieved only 18 and 13 points respectively when they launched in April 2025.

Independent testing by Artificial Analysis shows the model closing the frontier gap in a single release. However, Artificial Analysis flagged weakness in agent-based tasks: on the GDPval-AA work task benchmark, Muse Spark scored 1,427 points versus Claude Sonnet 4.6's 1,648 and GPT-5.4's 1,676.

In Meta's internal testing, Muse Spark achieved 58% on Humanity's Last Exam and 38% on FrontierScience Research. In extended thinking mode without tools, it scored 50.2 on Humanity's Last Exam (No Tools), outperforming both Gemini 3.1 and GPT-5.4 Pro on that benchmark.

Key Capabilities and Architecture

Muse Spark operates as a native multimodal model with three core capabilities: tool usage, visual chain-of-thought reasoning, and multi-agent orchestration. The model includes a "Contemplating Mode" designed to rival the deep reasoning features of other frontier models, such as Gemini Deep Think and GPT Pro.

Meta rebuilt the pretraining stack from the ground up over nine months, changing the model architecture, optimization, and data curation. According to Meta, Muse Spark matches Llama 4 Maverick's capabilities using over an order of magnitude less compute, which the company presents as making it substantially more efficient than competing base models.

The company employs two approaches to test-time compute. The first applies thought-time penalties that penalize long reasoning in order to optimize token consumption. Meta observed a phenomenon it calls "thought compression," where the model initially improves by thinking longer, then compresses its reasoning to solve problems with fewer tokens, before expanding its solutions again for stronger results. The second approach is multi-agent orchestration: deploying multiple agents in parallel on a difficult problem to boost performance without adding latency.
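To illustrate why running agents in parallel need not add latency, here is a minimal sketch using Python's asyncio. It is not Meta's implementation: the `run_agent` and `orchestrate` functions, the simulated delays, and the way answers are collected are all hypothetical, and the article does not describe how Muse Spark combines the agents' outputs.

```python
import asyncio
import time


async def run_agent(agent_id: int, problem: str, delay_s: float) -> str:
    # Hypothetical stand-in for one agent's reasoning pass;
    # the sleep simulates the time a model call would take.
    await asyncio.sleep(delay_s)
    return f"agent-{agent_id} answer to {problem!r}"


async def orchestrate(problem: str, delays: list[float]) -> tuple[list[str], float]:
    # Launch every agent at once and wait for all of them to finish.
    start = time.perf_counter()
    answers = await asyncio.gather(
        *(run_agent(i, problem, d) for i, d in enumerate(delays))
    )
    elapsed = time.perf_counter() - start
    return list(answers), elapsed


# Assumed per-agent latencies in seconds (illustrative values only).
delays = [0.05, 0.10, 0.08]
answers, elapsed = asyncio.run(orchestrate("hard problem", delays))
print(f"{len(answers)} answers in {elapsed:.2f}s "
      f"(sequential would take about {sum(delays):.2f}s)")
```

Because the agents run concurrently, the wall-clock time tracks the slowest agent rather than the sum of all of them, which is the sense in which parallel orchestration can improve results without lengthening the response time.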

Artificial Analysis verified Meta's efficiency claims: Muse Spark consumed 58 million output tokens for the full Intelligence Index run, roughly matching Gemini 3.1 Pro Preview (57 million) and well below Claude Opus 4.6 (157 million) and GPT-5.4 (120 million).
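The token counts above are easier to compare as ratios. The short sketch below uses only the figures quoted in the article; the function and variable names are illustrative.

```python
# Output tokens (in millions) for the full Intelligence Index run,
# as quoted in the article.
output_tokens_m = {
    "Muse Spark": 58,
    "Gemini 3.1 Pro Preview": 57,
    "GPT-5.4": 120,
    "Claude Opus 4.6": 157,
}


def relative_usage(tokens: dict, baseline: str = "Muse Spark") -> dict:
    # Express each model's token count as a multiple of the baseline's.
    base = tokens[baseline]
    return {name: round(count / base, 2) for name, count in tokens.items()}


ratios = relative_usage(output_tokens_m)
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ratio:.2f}x Muse Spark's output tokens")
```

On these figures, GPT-5.4 used about 2.07 times and Claude Opus 4.6 about 2.71 times as many output tokens as Muse Spark, while Gemini 3.1 Pro Preview used slightly fewer (about 0.98 times).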

Closed Weights Mark Strategic Shift

Unlike the Llama family, Muse Spark is not open-weight and cannot be run locally. This is a sharp break from the open-source playbook Meta championed for years. Meta's AI chief Alexandr Wang stated the company has "plans to open-source future versions," suggesting closed weights may not be a permanent policy. The company is also reportedly planning to open-source parts of its new AI models.

Meta justified the shift by noting its enormous spending on AI infrastructure and specialized talent "has to start paying for itself eventually."

Health and Multimodal Focus

Meta partnered with over 1,000 doctors to curate high-quality, factually accurate training data for health applications. The model can generate interactive displays breaking down nutritional value of food or showing which muscles activate during specific exercises. Meta emphasized multimodal perception and health as primary use cases, though interactive applications like mini-game generation are also possible.

Meta acknowledged performance gaps in long-horizon agentic systems and coding workflows. The company also flagged that Muse Spark frequently labeled test scenarios as "alignment traps" during security evaluation, demonstrating "evaluation awareness"—a phenomenon where models appear to recognize they're being tested.

Availability and Future Plans

Muse Spark is live on meta.ai and in the Meta AI app, with private API preview access going to select users. Pricing has not been disclosed.

Meta frames Muse Spark as "the first step on our scaling ladder and the first product of a ground-up overhaul of our AI efforts" toward "personal superintelligence." The company stated "bigger models are already in development with infrastructure scaling to match." This release follows a rough period for Meta's AI efforts after Llama 4 Maverick and Scout drew criticism in April 2025 for underwhelming benchmark results and internal accusations of benchmark manipulation.

What This Means

Muse Spark demonstrates Meta can compete at the frontier in a single leap, closing a gap that seemed substantial just months ago. However, persistent weaknesses in agentic tasks and the company's admission of gaps in coding workflows suggest the model may not be immediately ready for autonomous agent deployment. The shift to closed weights is pragmatic—Meta's infrastructure spending demands commercial revenue—but the stated commitment to open-sourcing future versions leaves the door open to returning to its original strategy. Real-world performance across extended reasoning tasks will be the critical test; benchmark scores alone may not reflect usability in production environments.

Source: the-decoder.com
