

Wan 2.1

Alibaba Cloud has launched Wan 2.1, an open-source video generation model that aims to challenge major competitors like OpenAI's Sora. Comprising four variants, Wan 2.1 excels in generating high-quality videos and images from text and image prompts. It leads the VBench leaderboard with an 86.22% score, outperforming other top models. Notably, it supports text effects in both Chinese and English. The model is accessible on Hugging Face under the Apache 2.0 license for academic and restricted commercial use. Alibaba emphasizes community collaboration, encouraging developers to innovate with Wan 2.1. However, concerns about data provenance and safety measures remain, highlighting broader industry challenges. This release marks a significant step in democratizing AI video generation, inviting global participation in its development and usage.
2025-03-02
Updated 2025-03-13 09:23:21

Alibaba Unleashes Wan 2.1: A Game-Changing Open-Source Video Model Hits the AI Scene

On February 26, 2025, Alibaba Cloud set the stage ablaze by unveiling Wan 2.1, the latest marvel in video generation technology. This isn't merely a product launch; it's an audacious leap into the future, one that challenges giants like OpenAI's Sora while pushing the open-source front forward.

The tech community is abuzz, and for good reason. Wan 2.1 isn't just reshaping the landscape of video generation; it marks another giant joining the open-source AI movement.

What’s Wan 2.1 All About?

Wan 2.1 isn’t a single model but a suite of four variants, each tailored for high-quality video and image generation from text and image prompts. Developed by Alibaba’s Wan team, this iteration—first teased in January 2025 as Tongyi Wanxiang—boasts some serious chops:

  • T2V-14B: A 14-billion-parameter text-to-video powerhouse.
  • T2V-1.3B: A lighter 1.3-billion-parameter version, needing just 8.19GB of VRAM—runnable on most modern GPUs like an RTX 4090.
  • I2V-14B-720P: Image-to-video at 720p resolution.
  • I2V-14B-480P: Image-to-video at 480p resolution.
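To see why the 1.3B variant fits on consumer GPUs, a rough back-of-envelope helps: weights stored in fp16 take about 2 bytes per parameter, so the model itself is only a fraction of the 8.19GB figure (the rest goes to activations, attention caches, and video latents). The sketch below is illustrative arithmetic, not a measurement:

```python
# Back-of-envelope estimate of weight memory at fp16 precision.
# Illustrative only: the article's 8.19GB VRAM figure also covers
# activations, attention caches, and video latents, not just weights.

def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory to hold the raw weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

t2v_1_3b = 1.3e9  # T2V-1.3B parameter count
t2v_14b = 14e9    # T2V-14B parameter count

print(f"T2V-1.3B fp16 weights: {weight_memory_gb(t2v_1_3b, 2):.1f} GB")
print(f"T2V-14B  fp16 weights: {weight_memory_gb(t2v_14b, 2):.1f} GB")
```

The ~2.6GB of weights for the 1.3B model explains the headroom on a 24GB card like the RTX 4090; the 14B variants, at roughly 28GB of fp16 weights alone, are a different class of hardware requirement.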

What sets it apart? Alibaba claims Wan 2.1 tops the VBench leaderboard with an 86.22% score, outpacing Sora (84.28%) and other state-of-the-art models in metrics like dynamic motion, spatial relationships, and multi-object interactions. It’s the only open-source model in VBench’s top five, a feat that’s hard to ignore. Plus, it’s the first video model to support text effects in both Chinese and English, adding a unique bilingual twist.

The lightweight T2V-1.3B can whip up a 5-second 480p video in about 4 minutes on an RTX 4090—unoptimized—making it accessible to hobbyists and pros alike. Posts on X have called it “wild” and “unbelievable,” with some suggesting it’s not just competitive but potentially transformative.


Official Announcements: Where to Find the Details

Alibaba didn’t bury the lede. Their official rollout came with clear statements.

Alibaba’s messaging is loud and clear: this isn’t just about tech—it’s about community. They’re banking on developers, researchers, and creators to take Wan 2.1 and run with it, much like the 100,000+ derivative models spawned from their Qwen family on Hugging Face.

Ready to dive in? Wan 2.1’s models are live on Hugging Face under the Apache 2.0 license, free for academic and restricted commercial use. Here’s where to grab them:

  • Wan2.1-T2V-14B: Available on the Wan team's Hugging Face page. The top-performing text-to-video model, updated February 22, 2025, with inference code and weights.
  • Wan2.1-T2V-1.3B: Details on the 1.3B variant are also hosted on the same page, noted for its 720p potential (though less stable than 480p due to training limits). Use the same link and adjust parameters as needed.

Installation’s straightforward—pip install "huggingface_hub[cli]" followed by a huggingface-cli download command—but check the Hugging Face page for full setup instructions, including API key configs for Alibaba’s Dashscope integration. The models are also on Alibaba’s ModelScope, though Hugging Face is the go-to for most AI enthusiasts.
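The two-step install described above can be sketched as follows. The repo id `Wan-AI/Wan2.1-T2V-1.3B` and the local directory name are assumptions based on the standard Hugging Face layout; verify the exact id on the model page before running:

```shell
# Step 1: install the Hugging Face CLI (the pip command from the article)
pip install "huggingface_hub[cli]"

# Step 2: download a variant with huggingface-cli.
# The repo id below is an assumption; confirm it on the model's
# Hugging Face page, and swap in the 14B id for the larger variants.
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir ./Wan2.1-T2V-1.3B
```

Note that the 14B weights are a multi-gigabyte download, so the `--local-dir` target should have substantial free disk space.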

Why It Matters—and Why You Should Care

Alibaba’s not just tossing out code; they’re throwing down a gauntlet. Open-sourcing Wan 2.1 follows DeepSeek’s January playbook—disrupt the market with cost-efficient, accessible AI. Unlike Sora’s closed ecosystem, Wan 2.1 invites scrutiny and innovation, leveraging a training dataset of 1.5 billion videos and 10 billion images (per Reuters). That’s a scale that raises eyebrows—how’d they pull that off ethically? The announcement skips over safety or data provenance details, leaving room for skepticism.

For your AI toolkit, this is gold. The T2V-1.3B’s low VRAM footprint means you can experiment without a data center, while the 14B variants push cinematic quality—think 8K resolution with physics-aware motion. Early X buzz suggests it’s “faster” and “more coherent” than some rivals, though real-world testing will tell.

The Bigger Picture

This drop comes amid China’s AI ascent—DeepSeek’s infrastructure release on February 28, Qwen 2.5’s May 2024 debut, and now Wan 2.1. Alibaba’s $52 billion bet signals they’re not playing small ball. But it’s not all rosy: the lack of transparency on training data and safety measures mirrors industry-wide gaps. Open-source doesn’t mean risk-free—expect debates on misuse potential to heat up.

For now, Wan 2.1’s a win for the community. Whether it’s a hobby project or a commercial tweak, you’ve got a front-row seat to a video AI revolution. Grab the models, test the hype, and let us know what you think—because if Alibaba’s right, this is just the start.
