Alibaba Qwen3.5

Alibaba has released Qwen3.5, headlined by a 397 billion parameter Mixture-of-Experts model with 17 billion active parameters per token. Shipped under Apache 2.0 on Hugging Face, Qwen3.5 scores 93.3% on AIME 2026, 85.0 on LiveCodeBench v6, and 76.8% on SWE-Bench Verified, putting it in frontier territory for math, coding, and agent tasks. The broader Qwen3.5 family spans dense models from sub-1 billion up to 32 billion parameters, plus sparse MoE variants, giving developers open-weight options at every scale.

License: Apache 2.0
TL;DR
  • 397B-A17B Mixture-of-Experts flagship under Apache 2.0. Serves at roughly the inference cost of a 17B dense model with the quality of a 400B giant.
  • AIME 2026 at 93.3%, LiveCodeBench v6 at 85.0, SWE-Bench Verified at 76.8%. Frontier-class on math, coding, and agent benchmarks.
  • Full family spans sub-1B dense to the 397B-A17B sparse flagship, every size Apache 2.0, shipped 2026-02-16.
System Requirements
  • RAM: 8 GB (smallest dense model)
  • GPU: RTX 4090 24 GB (32B dense); 8x H100 (397B)
  • VRAM: 8 GB to 800 GB depending on variant
  • ✓ Ollama ✓ Apple Silicon
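The VRAM spread above follows directly from weight size. A back-of-envelope sketch (weights only; KV cache and runtime overhead add more on top):

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone.

    Ignores KV cache, activations, and runtime overhead, so real
    requirements are somewhat higher than this estimate.
    """
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal gigabytes

# 397B flagship in bf16: roughly the ~800 GB figure quoted above.
print(round(weight_memory_gb(397, 16)))  # 794
# 32B dense quantized to 4-bit: fits a 24 GB RTX 4090.
print(round(weight_memory_gb(32, 4)))    # 16
```

The same arithmetic explains the 8 GB floor: the smallest dense variants stay under a few GB even at 16-bit precision.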

On February 16, 2026, Alibaba released Qwen3.5, and the headline model is a 397 billion parameter Mixture-of-Experts giant with just 17 billion active parameters per token. Shipped under Apache 2.0 on Hugging Face, it immediately became one of the most capable open-weight models ever published, with scores on AIME 2026, LiveCodeBench, and SWE-Bench Verified that rival the top proprietary frontier models.

Qwen3.5 397B-A17B: An Ultra-Sparse Giant

The flagship Qwen3.5 release is an ultra-sparse Mixture-of-Experts model:

  • 397 billion total parameters
  • 17 billion active parameters per token (roughly 4.3% activation)
  • Apache 2.0 license, open weights on Hugging Face
  • Part of a broader Qwen3.5 family that spans from sub-1B dense models to the 397B flagship

The sparsity ratio is the key. At 17B active, serving Qwen3.5 is closer in cost to running a 17B dense model than a 397B giant, but the model draws on the full parameter pool during expert routing. That gives it frontier-class quality at mid-tier inference cost.
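A minimal sketch of the routing mechanism behind that sparsity, using toy sizes rather than Qwen3.5's actual configuration: the router scores every expert for each token, but only the top-k experts actually run.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes for illustration, not Qwen3.5's real config.
n_experts, top_k, d_model = 64, 4, 8

x = rng.normal(size=d_model)                   # one token's hidden state
router_w = rng.normal(size=(n_experts, d_model))

# The router scores all experts, but only the top-k are executed.
logits = router_w @ x
top = np.argsort(logits)[-top_k:]              # indices of the chosen experts
gates = np.exp(logits[top] - logits[top].max())
gates /= gates.sum()                           # softmax over chosen experts only

print(sorted(top.tolist()), gates.round(3))
```

With top-4 of 64 experts, only a small fraction of expert weights touch any given token; scaled up, the same mechanism is what lets 17B of 397B parameters activate per token (roughly 4.3%).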

Benchmarks That Put It in Frontier Territory

Qwen3.5 397B-A17B posts numbers that put it squarely in the top tier among all models, open or closed:

  • AIME 2026 (competition math): 93.3%
  • LiveCodeBench v6 (coding): 85.0
  • SWE-Bench Verified (coding agent): 76.8%

For reference, a year ago the best closed frontier API models were scoring in the 60s and 70s on these benchmarks. Qwen3.5 brings that frontier into open-weight territory.

A Whole Family, Not Just a Flagship

Alongside the 397B-A17B flagship, Alibaba released smaller Qwen3.5 variants designed for developers who want to run them on consumer or single-GPU hardware. The lineup spans dense models from roughly 600 million parameters up through a 32B dense model, plus the sparse MoE variants at 30B-A3B and the flagship 397B-A17B. Every open-weight size in the family ships under Apache 2.0.

Where to Get It

  1. Hugging Face: browse the Qwen organization for all 3.5 checkpoints.
  2. ModelScope: Alibaba's own model hub for users in China.
  3. Ollama: pull smaller variants for local testing.
  4. Alibaba Cloud API: hosted inference for teams that do not want to self-host.
  5. GitHub: reference code and examples at QwenLM/Qwen3.5.

Why Qwen Keeps Winning on Open Weights

Alibaba has now shipped a consistent cadence of open-weight flagship releases, from Qwen2 to Qwen2.5, Qwen3, Qwen3-Next, and now Qwen3.5. Each release raises the bar on benchmark scores while keeping the Apache 2.0 license. The Qwen family has become the default base model for a huge share of fine-tuned derivatives on Hugging Face, and Qwen3.5 is the new starting point for anyone building on top of open weights in 2026.

 
