Qwen 3.6: a 27B that beats a 397B, but Alibaba closed the flagship

Subscribe to the Newsletter

Join 10k+ people to get notified about new posts, news and tips.

Do not worry we don't spam!

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Log in

Have no account yet? Sign up

Create an account

Already have an account? Log in

Reset password

Remember your password? Log in

Terms of use

SingularityByte.com values the privacy of our users. Therefore, this privacy policy explains in detail how we use and protect the information we collect when you visit our website.. Read this privacy policy completely. Please refrain from visiting the site if the terms outlined below are not satisfactory to you. We reserve the right to change this policy at any time and will list these changes in the updates section of the policy. By reading this notice and visiting the site, you agree that you understand that customers will not be personally notified when this policy changes. Therefore, we advise our customers to frequently review our privacy policy so that they remain aware of its updates. By using the site, you accept that the posted policy and all its changes apply to your interaction with SingularityByte.com.

Information Collected by SingularityByte.com

Personal information may be collected by this site in many ways. This information includes:

Personal identifying information like your name, address, email, phone number, age, gender, and other personal data
Server data related to the IP address you used to visit our website, which includes your address, browser, OS, access time, and site activity.
Financial information related to your orders including your payment method and identifying payment information. We rarely store financial information collected on our site for transaction purposes. That information gets sent directly to our payment processor.
Social network data including Facebook permissions and user information from other networks, provided you log onto our site using one of these media sites.
Mobile device information such as your device ID, model, and location, if you use our site by accessing trough our website.

How We Use This Information

Our website uses information collected to:
• Manage your account information
• Customize ads
• Deliver promotions
• Email your account confirmation
• Manage purchases and payments
• Increase site efficiency
• Notify you of updates
• Offer new products
• Monitor and prevent theft
• Request your customer feedback
• Resolve account disputes
• Respond to your service requests

Information Disclosure

Normally, your information stays on our site. However, below we have listed the situations that may
require us to share the information we collect from you:
• When required by law, such as for fraud protection
• With our third-party providers for payment processing and hosting
• With your consent for marketing purposes
• When you post comments on the site
• To our advertisers, affiliates, and partners
• If this site goes bankrupt and data must be transferred

Cookies, Trackers, and Online Ads

We may use cookies, trackers, web beacons, and other technology to customize our website to improve your experience. We may customize the site using this information. These trackers do not have access to your personal information and can be removed from your browser options. In addition, third-party software provides ads for our site for marketing campaigns. These programs have access to tracking technology to optimize your ad experience. For more information about these
ads, visit [link to the privacy policies of affiliate advertisers]. Website analytics such as through Google Analytics may also be used to track users
and remarket our website. We do not give these vendors access to your personal information.

Other Sites

Our website may contain links to third-party websites in the form of policies, ads, and other non-affiliated links. Once you leave our site, we are no longer responsible for how your information is collected and disclosed. Please refer to the privacy policies of those third-party sites for more information.

Information Security

We take technical and administrative precautions to protect your data, but we cannot guarantee its safety against all types of fraud or misuse. If you provide personal information, we cannot verify its total security against all types of interception.

Do-Not-Track

Some browsers offer Do-Not-Track settings to prevent any information from being distributed. Since these settings have not been legally established as standard practice, we do acknowledge these settings.

Additional Options

At any time, you may opt to review or change your account settings, including contact information. If you wish to delete your account, you may do so to remove most of your information, however, some identifying information will be retained to prevent fraud.
You may also opt-out of emails and other correspondences from our site at any time.

Microsoft Clarity

We partner with Microsoft Clarity and Microsoft Advertising to capture how you use and interact with our website through behavioral metrics, heatmaps, and session replay to improve and market our products/services. Website usage data is captured using first and third-party cookies and other tracking technologies to determine the popularity of products/services and online activity. Additionally, we use this information for site optimization, fraud/security purposes, and advertising. For more information about how Microsoft collects and uses your data, visit the Microsoft Privacy Statement.

Contact Us

If you have questions or concerns about this privacy policy, please feel free to contact us at: desk@SingularityByte.com

Do you agree to our terms? Sign up

License Apache 2.0

TL;DR

Alibaba's Qwen 3.6 open models: a dense 27B and a 35B-A3B MoE, Apache 2.0, multimodal with tool calling and up to about 1M context.
The dense 27B beats the previous-generation 397B flagship on coding and runs on a single 24GB GPU.
But the flagship Qwen3.6-Max is closed and API-only, a shift from the fully-open Qwen3.5 era.

☍ Announcement ⬇ Download Model

System Requirements

RAM	16GB+
GPU	24GB GPU (RTX 4090)
VRAM	~17GB (27B Q4)

✓ Ollama ✓ Apple Silicon

Table of Contents

Alibaba's Qwen 3.6 is two stories in one. The good one for open-source builders: a dense 27-billion-parameter model that beats the previous generation's 397B Mixture-of-Experts flagship on coding, and runs on a single 24GB GPU, under Apache 2.0. The other one, worth saying out loud: the actual flagship, Qwen3.6-Max, is closed and API-only, a quiet break from the fully-open Qwen3.5 era. Here is what you can download, what you cannot, and how to run the part that is open.

What is open, and what is not

Alibaba's Qwen team released two genuinely open-weight Qwen 3.6 models in April 2026, both Apache 2.0 and both already past five million downloads: Qwen3.6-27B, a dense model, and Qwen3.6-35B-A3B, a Mixture-of-Experts model with about 3 billion active parameters. Both take text, image, and video, call tools, and carry a 256K-token context that stretches to roughly a million with YaRN scaling. What did not ship as weights is Qwen3.6-Max, the roughly trillion-parameter flagship, which exists only behind Alibaba's API. So "Qwen 3.6 is open" is true for the models most people will actually run, and false for the top of the lineup. Do not let a headline conflate them.

The interesting model: a 27B that beats a 397B

The dense 27B is the one to watch. Its trick is architectural: it leans on Gated DeltaNet, a linear-attention design, for three of every four layers, with standard attention in the rest. (Linear attention scales better with context length, which is how a 27B model holds a near-million-token window without melting.) It adds multi-token prediction for faster generation and keeps a thinking mode on by default. The payoff, in Qwen's own numbers, is that this 27B edges out the previous-generation Qwen3.5-397B-A17B on several coding benchmarks while fitting on a single consumer GPU. Intelligence per parameter, not raw scale.

Benchmarks (Qwen-reported)

The numbers below are from Qwen's own launch materials for the 27B. They compare mainly against its predecessors and Claude 4.5 Opus; there is no independent reproduction, and cross-vendor comparisons to Kimi or DeepSeek come only from third-party aggregators, so treat those as community-reported.

Benchmark	Qwen3.6-27B	Note
SWE-Bench Verified	77.2	beats Qwen3.5-397B (76.2)
Terminal-Bench 2.0	59.3	Qwen says on par with Claude 4.5 Opus
LiveCodeBench v6	83.9	coding
GPQA Diamond	87.8	reasoning
AIME 2026	94.1	math

Qwen-reported figures for the dense 27B, not independently reproduced. Comparisons beyond the Qwen3.5 family and Claude are third-party.

Limitations and gotchas

The flagship is closed. Qwen3.6-Max is API-only; the open story stops at 27B and 35B-A3B.
Benchmarks are Qwen-reported. Strong, but unverified, and the headline comparison is against Qwen's own older model.
Vision adds friction for some runners: the vision projector file complicates plain Ollama GGUF use, so llama.cpp, vLLM, or SGLang are the safer paths.
Thinking on by default means more output tokens; turn it off when you just want a quick answer.

Who should use it

If you want a capable, openly licensed model that runs on one 24GB GPU and is genuinely good at code and tool use, the Qwen3.6-27B is one of the best picks of 2026, and a clear upgrade path from Qwen3.5 for local builders. Reach for the 35B-A3B MoE if you have a bit more memory and want faster inference per token. Skip the closed Max unless you specifically need its ceiling and are fine with an API. For most self-hosters, the open 27B is the story.

Run it in about 10 minutes

The dense 27B fits in roughly 17GB at 4-bit, so a 24GB card or a recent Mac runs it.

# Quantized local run via the community GGUF. A 24GB GPU or 32GB Mac is comfortable.
ollama run hf.co/unsloth/Qwen3.6-27B-GGUF:Q4_K_M

# Or serve the full weights with vLLM and the long context
vllm serve Qwen/Qwen3.6-27B --max-model-len 262144 --reasoning-parser qwen3

Give it a real repository task with tools enabled; tool calling and the long context are where the 3.6 generation pulled ahead of 3.5. If you run a coding agent, point a harness like OpenCode at the local endpoint and let it work.

Sources and further reading

Tested on: not independently tested. The Qwen3.6-27B is reported to run in about 17GB at 4-bit on a single 24GB GPU or a 32GB Mac; benchmarks are Qwen-reported and not independently reproduced, with cross-vendor comparisons drawn from third-party aggregators. The flagship Qwen3.6-Max is closed and was not evaluated.
Date checked: 2026-06-26

Subscribe to the Newsletter

Search

GDPR Compliance