
Meta Muse Spark

Meta launches Muse Spark (codename Avocado), its first proprietary model from Superintelligence Labs. Open-weight versions planned. Mango world model for video still in development.
2026-04-10

License: Proprietary
TL;DR
  • First proprietary model from Meta Superintelligence Labs (MSL)
  • Competitive with Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro
  • 10x more compute-efficient than Llama 4 Maverick

Meta just shipped its first proprietary AI model. Let that sink in for a moment. The company that open-sourced Llama, built the largest open-weight model ecosystem in history, and positioned itself as the open-source counterweight to OpenAI, has launched Muse Spark, a closed model with no public weights. Internally codenamed "Avocado," it is the first release from Meta Superintelligence Labs (MSL), led by Scale AI cofounder Alexandr Wang after Meta's $14.3 billion deal in 2025.

But before you write Meta off the open-source roster: the company says open-weight versions are coming. And a second model, codenamed Mango, a "world model" for image and video generation, is still in the pipeline. Here is what we know, what it means for open-source builders, and whether Meta's open-weight era is actually over.

What Is Muse Spark?

Muse Spark is a natively multimodal reasoning model announced on April 8, 2026. It replaces the Llama 4 family as Meta's flagship model and introduces a new architecture built from scratch under Wang's team. Key specs:

  • Architecture: New design (not Llama-based), with "Chain of Thought" processing baked into pretraining
  • Efficiency: Meta claims it reaches the same capabilities with over 10x less compute than Llama 4 Maverick
  • Context: Natively multimodal (text + images), with tool-use support and visual chain-of-thought reasoning
  • Modes: Instant (fast responses), Thinking (enhanced reasoning), and Contemplating (multi-agent orchestration, coming soon)

Benchmark Numbers

Meta reports competitive results, though not dominant across the board. Here is how Muse Spark stacks up based on available data:

  • GPQA Diamond (PhD-level reasoning): 89.5% — competitors hit 92.7-94.3%
  • Humanity's Last Exam (Contemplating mode): 58% — top-tier range
  • HealthBench Hard: 42.8% — outperforms rivals
  • Artificial Analysis Overall: 52 — behind Gemini 3.1 Pro, GPT-5.4, Claude Opus 4.6

Meta acknowledges the model has "gaps in long-horizon agentic systems and coding workflows." That is a notable admission, especially when competitors like GLM-5.1 are shipping 8-hour autonomous coding agents. On Terminal-Bench 2.0, Muse Spark lags behind the top three frontier models.

16 Built-in Tools

Where Muse Spark gets interesting for developers is its integrated tooling. The model ships with 16 built-in tools, including:

  • Code execution: Python 3.9 sandbox with pandas, numpy, matplotlib, scikit-learn, OpenCV
  • Browser tools: Web search, page loading, pattern matching
  • Visual grounding: Object detection with bounding boxes and point coordinates
  • Sub-agent spawning: Delegate subtasks to child agents
  • File operations: View, insert, replace across files
  • HTML/SVG artifacts: Sandboxed rendering of generated code
  • Meta content search: Semantic search across Instagram, Threads, and Facebook posts
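Meta has not published what runs inside the sandbox beyond the library list, but the code-execution tool is an ordinary Python environment. As a rough illustration (generic pandas/numpy, nothing Muse-specific), this is the kind of self-contained snippet such a tool typically executes on the model's behalf:

```python
# Representative workload for a pandas/numpy code-execution sandbox:
# load a small dataset into a DataFrame and compute per-group averages.
import pandas as pd

df = pd.DataFrame({
    "model": ["A", "A", "B", "B"],
    "score": [89.5, 58.0, 92.7, 61.0],
})

# Aggregate scores by model; the sandbox would return this printed output
# to the main reasoning loop.
means = df.groupby("model")["score"].mean()
print(means.to_dict())
```

The value of baking this in is that the model can offload exact arithmetic and data wrangling to code instead of attempting it in-token.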

The sub-agent spawning is the standout feature. Muse Spark can break complex tasks into subtasks, spin up child agents, and orchestrate them. This is the "agentic stack" that internal documents describe as Meta's core architectural bet.
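Meta has not documented the orchestration API, but the pattern itself is straightforward to sketch. Here is a minimal generic illustration of sub-agent spawning — all names are hypothetical stand-ins, not Meta code: a parent decomposes a task, dispatches subtasks to child workers in parallel, and merges the results.

```python
# Generic sketch of the sub-agent pattern (hypothetical names, not Meta's API):
# a parent agent splits a task into subtasks, spawns child "agents" (here
# plain functions standing in for child model calls), and collects results.
from concurrent.futures import ThreadPoolExecutor

def child_agent(subtask: str) -> str:
    # Stand-in for a delegated child model call.
    return f"result({subtask})"

def parent_agent(task: str) -> list[str]:
    subtasks = [f"{task}:part{i}" for i in range(3)]  # decompose the task
    with ThreadPoolExecutor() as pool:                # spawn child agents
        # map preserves subtask order, so merging is trivial
        return list(pool.map(child_agent, subtasks))

print(parent_agent("summarize"))
```

In a real agentic stack each child would be its own model invocation with its own tool budget; the parent's job is decomposition, dispatch, and reconciliation.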

The Avocado Codename and Its Variants

Before Muse Spark shipped, internal testing revealed multiple Avocado variants being evaluated:

  • Avocado 9B: A smaller 9 billion parameter version for lightweight deployments
  • Avocado Mango Agent: A multimodal variant with image generation capabilities
  • Avocado TOMM: "Tool of Many Models," a routing layer that dispatches to specialized sub-models
  • Avocado Thinking 5.6: The latest iteration of the reasoning-focused variant

The launch was originally planned for March 2026 but was pushed to April after internal tests showed the model "falls short of leading systems" from Google, OpenAI, and Anthropic in reasoning, coding, and writing. During the delay, Meta reportedly routed some requests through Google's Gemini models to fill the gap.

Mango: The "World Model" for Video

The second piece of Meta's strategy is Mango, a separate image and video generation model. Unlike standard diffusion-based generators, Meta describes Mango as a "world model" that aims to understand physics, causality, and temporal continuity. Instead of just generating pixels, Mango is designed to learn the laws of physics alongside visual generation, which in theory should reduce hallucinated motion and impossible physics in generated video.

Mango has no release date yet. When it ships, it will compete directly with Google's Veo, OpenAI's Sora, and the growing ecosystem of open-source video models like Hunyuan Video and Kling.

What Happened to Open Source?

This is the question that matters most for our community. Meta's track record on open weights is unmatched among big tech: Llama 2, Llama 3, Llama 3.1/3.2/3.3, and Llama 4 collectively built the largest open-weight model ecosystem in the industry. So what changed?

Three things:

  1. Alexandr Wang's hybrid strategy. Wang views Meta as a force for "democratizing access," but also believes the largest frontier models carry safety risks that justify keeping them proprietary initially. The plan is to release open-weight versions after the proprietary launch.
  2. Competitive pressure. Muse Spark already trails competitors on several benchmarks. Releasing weights immediately would let rivals finetune on Meta's architecture before Meta has extracted value from it.
  3. Consumer-first positioning. Unlike Anthropic and OpenAI targeting enterprise and government, Meta wants to deploy through WhatsApp, Instagram, and Facebook first. Open weights come later.

Wang has stated publicly: "private API preview open to select partners today, with plans to open-source future versions." That is not a firm commitment, but it is not an abandonment of open source either.

What This Means for Open-Source Builders

The short version: Llama is not dead, but it is no longer Meta's flagship. Muse is the new brand, and the first Muse model is proprietary. If Meta follows through on open-weight releases, the community will eventually get access to a model architecture that is 10x more efficient than Llama 4 Maverick. That would be significant.

In the meantime, the open-weight frontier is well-served:

  • GLM-5.1 (754B MoE, MIT license) now tops SWE-Bench Pro
  • Qwen3.5 (397B-A17B, Apache 2.0) dominates math and coding benchmarks
  • Gemma 4 (31B Dense, Apache 2.0) outperforms Llama 4 on multiple benchmarks

The open-source AI ecosystem does not depend on Meta anymore. But if Meta ships open-weight Muse models with that 10x efficiency gain, it will be a welcome addition to the toolkit.

How to Access Muse Spark Today

Muse Spark is currently available through two channels:

  1. meta.ai: Free access with a Facebook or Instagram login. Includes Instant and Thinking modes.
  2. Private API: Available to select partners only. No public API pricing announced yet.

There is no Hugging Face download, no Ollama pull, no self-hosting option. For now, you are locked into Meta's infrastructure.

The Bottom Line

Meta shipping a proprietary model is a strategic shift, not an ideological one. The company is betting that launching closed-first and opening later gives it a competitive edge while still maintaining developer goodwill. Whether that bet pays off depends entirely on how fast the open-weight versions actually ship.

For open-source builders, the practical impact today is zero. Muse Spark is a hosted service you can try at meta.ai, but you cannot run it locally, finetune it, or build on it. Keep building with Qwen3.5, GLM-5.1, and Gemma 4. When Meta opens the weights, we will be here to benchmark it.

Tested on: N/A (hosted service, no local deployment available)
Date tested: 2026-04-10
