NVIDIA Nemotron 3 Ultra: a fully open 550B agent model, with the data and recipes

Subscribe to the Newsletter

Join 10k+ people to get notified about new posts, news and tips.

Do not worry we don't spam!

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Log in

Have no account yet? Sign up

Create an account

Already have an account? Log in

Reset password

Remember your password? Log in

Terms of use

SingularityByte.com values the privacy of our users. Therefore, this privacy policy explains in detail how we use and protect the information we collect when you visit our website.. Read this privacy policy completely. Please refrain from visiting the site if the terms outlined below are not satisfactory to you. We reserve the right to change this policy at any time and will list these changes in the updates section of the policy. By reading this notice and visiting the site, you agree that you understand that customers will not be personally notified when this policy changes. Therefore, we advise our customers to frequently review our privacy policy so that they remain aware of its updates. By using the site, you accept that the posted policy and all its changes apply to your interaction with SingularityByte.com.

Information Collected by SingularityByte.com

Personal information may be collected by this site in many ways. This information includes:

Personal identifying information like your name, address, email, phone number, age, gender, and other personal data
Server data related to the IP address you used to visit our website, which includes your address, browser, OS, access time, and site activity.
Financial information related to your orders including your payment method and identifying payment information. We rarely store financial information collected on our site for transaction purposes. That information gets sent directly to our payment processor.
Social network data including Facebook permissions and user information from other networks, provided you log onto our site using one of these media sites.
Mobile device information such as your device ID, model, and location, if you use our site by accessing trough our website.

How We Use This Information

Our website uses information collected to:
• Manage your account information
• Customize ads
• Deliver promotions
• Email your account confirmation
• Manage purchases and payments
• Increase site efficiency
• Notify you of updates
• Offer new products
• Monitor and prevent theft
• Request your customer feedback
• Resolve account disputes
• Respond to your service requests

Information Disclosure

Normally, your information stays on our site. However, below we have listed the situations that may
require us to share the information we collect from you:
• When required by law, such as for fraud protection
• With our third-party providers for payment processing and hosting
• With your consent for marketing purposes
• When you post comments on the site
• To our advertisers, affiliates, and partners
• If this site goes bankrupt and data must be transferred

Cookies, Trackers, and Online Ads

We may use cookies, trackers, web beacons, and other technology to customize our website to improve your experience. We may customize the site using this information. These trackers do not have access to your personal information and can be removed from your browser options. In addition, third-party software provides ads for our site for marketing campaigns. These programs have access to tracking technology to optimize your ad experience. For more information about these
ads, visit [link to the privacy policies of affiliate advertisers]. Website analytics such as through Google Analytics may also be used to track users
and remarket our website. We do not give these vendors access to your personal information.

Other Sites

Our website may contain links to third-party websites in the form of policies, ads, and other non-affiliated links. Once you leave our site, we are no longer responsible for how your information is collected and disclosed. Please refer to the privacy policies of those third-party sites for more information.

Information Security

We take technical and administrative precautions to protect your data, but we cannot guarantee its safety against all types of fraud or misuse. If you provide personal information, we cannot verify its total security against all types of interception.

Do-Not-Track

Some browsers offer Do-Not-Track settings to prevent any information from being distributed. Since these settings have not been legally established as standard practice, we do acknowledge these settings.

Additional Options

At any time, you may opt to review or change your account settings, including contact information. If you wish to delete your account, you may do so to remove most of your information, however, some identifying information will be retained to prevent fraud.
You may also opt-out of emails and other correspondences from our site at any time.

Microsoft Clarity

We partner with Microsoft Clarity and Microsoft Advertising to capture how you use and interact with our website through behavioral metrics, heatmaps, and session replay to improve and market our products/services. Website usage data is captured using first and third-party cookies and other tracking technologies to determine the popularity of products/services and online activity. Additionally, we use this information for site optimization, fraud/security purposes, and advertising. For more information about how Microsoft collects and uses your data, visit the Microsoft Privacy Statement.

Contact Us

If you have questions or concerns about this privacy policy, please feel free to contact us at: desk@SingularityByte.com

Do you agree to our terms? Sign up

License OpenMDW 1.1

TL;DR

NVIDIA's agentic flagship: a 550B hybrid Mamba-Transformer MoE (55B active) with a 1M-token context, pretrained in NVFP4.
Ships weights, training data, recipes, and RL environments under the permissive OpenMDW-1.1, commercial use allowed.
The most intelligent US open-weights model (Artificial Analysis index 48); needs a Blackwell or Hopper node to run.

☍ Announcement ⬇ Download Model

System Requirements

RAM	data-center
GPU	8x H200 / 4x B200
VRAM	~275GB+ NVFP4

✓ Ollama

Table of Contents

On June 4, 2026, NVIDIA dropped Nemotron 3 Ultra, and for once the "open" in "open model" is the whole point. It is a 550-billion-parameter model built for long-running agents, it tops Artificial Analysis as the most intelligent US open-weights model, and NVIDIA shipped not just the weights but the training data, the recipes, and the reinforcement-learning environments. It also did something quieter that matters more to builders: it moved off NVIDIA's restrictive model license onto a permissive one that allows commercial use without strings. Here is what it is, what is actually open, and why you still cannot run it at home.

What NVIDIA shipped, and which Nemotron this is

First, disambiguation, because NVIDIA's naming is a maze. Nemotron 3 Ultra is the June 2026 agentic flagship, the largest of the Nemotron 3 family (Nano, Super, Ultra). It is not Nemotron-Cascade 2, a separate and much smaller 30B model from March that was a post-training-recipe showcase. When people say "the new open NVIDIA model," they mean Ultra.

Ultra is a hybrid. Instead of a pure Transformer, it interleaves Mamba-2 state-space layers with attention and a Mixture-of-Experts design NVIDIA calls LatentMoE. (Mamba layers process long sequences in linear time rather than attention's quadratic cost, which is how you get a sane 1M-token context.) The shape: 550B total, 55B active per token, 512 experts with the top 22 routed, 108 layers, and native Multi-Token Prediction for speculative decoding. It was pretrained directly in 4-bit NVFP4 on about 20 trillion tokens. Text only, 12 languages.

The part that is actually open

Plenty of "open" models give you a weights file and nothing else. Nemotron 3 Ultra ships four checkpoints (base, post-trained, NVFP4, and a reward model), plus the pretraining data, the post-training recipes, and the RL environments, all on GitHub under NVIDIA-NeMo. You can reproduce the post-training, not just run the result.

And the license changed. Older Nemotron models used the NVIDIA Open Model License, which carried restrictions. Ultra ships under OpenMDW-1.1, a permissive, MIT-style license that grants commercial and non-commercial use, allows redistribution, and puts no restrictions on the model's outputs. For a company that has historically been careful with its licenses, that is a real shift toward genuine openness.

Benchmarks: the strongest US open model, with caveats

NVIDIA's own numbers, from the technical report, put Ultra firmly in the agentic mix. Independent confirmation comes from Artificial Analysis, which scored it at 48 on its Intelligence Index and called it the most intelligent US open-weights model as of June 2026. The honest footnote: Chinese open models still lead the overall open frontier (GLM-5.2 and Kimi score higher), so "best US open model" is the accurate claim, not "best open model."

Benchmark	Nemotron 3 Ultra	What it measures
SWE-Bench Verified	70.7	real-world coding fixes
Terminal-Bench 2.1	56.4	terminal and coding agents
Tau-Bench V3	70.9	tool-use agents
RULER @ 1M	94.7	long-context recall
IOI 2025	570/600	competitive programming

NVIDIA-reported (technical report); Ultra run with TensorRT-LLM. The Intelligence Index ranking is independent (Artificial Analysis). Not reproduced by us.

Where it leads is long-context and agent throughput: RULER at 1M context (94.7) tops its peers, and NVIDIA claims roughly 5 to 6 times the throughput of GLM-5.1 and Kimi K2.6 on long agentic runs, the payoff of the Mamba hybrid and native NVFP4. Where it trails: raw coding-agent scores like Terminal-Bench, where Kimi sits higher.

Limitations and gotchas

Not local. 550B in BF16 needs 8x H200 or 16x H100; even 4-bit NVFP4 wants 4x B200 or 8x H100. There is no consumer-GPU or Apple Silicon path.
Best on Blackwell. NVFP4 is native on Blackwell; on older cards you fall back to a W4A16 path, which works but gives up part of the speed story.
Text only. No vision or audio in Ultra; NVIDIA ships separate ASR and safety models.
Benchmarks are NVIDIA-reported. The independent Artificial Analysis index corroborates the tier, not every number.

Who should use it

Use it if you are building long-horizon agents (multi-step planning, tool use, sub-agent delegation across hundreds of turns) and you want a fully open, commercially usable base you can fine-tune and self-host on a Blackwell or Hopper node. It integrates with CrewAI and LangChain agent stacks out of the box. If you want something you can run on your own GPU, this is not it; reach for a smaller Nemotron (Nano or Super) or a mid-size MoE. The reason to care about Ultra even if you cannot run it: the open data and recipes are a gift to anyone training their own agent model.

Run it in about 10 minutes

Realistically that means a hosted endpoint or a rented multi-GPU box. The fastest taste is the NVIDIA NIM playground or OpenRouter; the self-host path is vLLM.

# Hosted: try it free on OpenRouter or build.nvidia.com first.

# Self-host NVFP4 on 8x H100 (or 4x B200) with vLLM
docker run --gpus all vllm/vllm-openai:v0.22.0 \
  --model nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4 \
  --tensor-parallel-size 8

Then point a CrewAI or LangChain Deep Agents loop at the endpoint and hand it a task that needs planning and tool calls across many turns. That is the workload Ultra was post-trained for, and where the throughput advantage shows up. If you train agent models yourself, the real ten-minute move is reading the post-training recipes on GitHub, that is the part nobody else open-sourced.

Sources and further reading

Tested on: not independently tested. Nemotron 3 Ultra is a 550B hybrid MoE that needs a Blackwell or Hopper multi-GPU node even at NVFP4, beyond our bench. Benchmarks are NVIDIA-reported; the open-weights ranking is from Artificial Analysis, flagged as third-party. Sources linked above.
Date checked: 2026-06-26

Subscribe to the Newsletter

Search

GDPR Compliance