Ideogram - Image Generation

Ideogram 4

Ideogram 4 is a closed image lab's first open weights: a 9.3B from-scratch flow-matching model with the best text rendering of any open model, running on a single 24GB GPU. The catch is the license: free to play with, 300 dollars a month to ship. The honest version.

License: Non-Commercial
TL;DR
  • Ideogram's first open weights: a 9.3B flow-matching Diffusion Transformer, trained from scratch, with best-in-class text rendering.
  • Runs on a single 24GB consumer GPU (nf4) or Apple Silicon (fp8). Code is Apache 2.0; weights are non-commercial.
  • Commercial use needs a 300-dollar-a-month self-serve license or enterprise terms. Open to inspect, not open-source.
System Requirements
RAM: 16GB+
GPU: 24GB GPU (RTX 3090/4090)
VRAM: 24GB (nf4)
Apple Silicon: supported (fp8 build)

Ideogram spent years as the image model people reached for when they needed actual legible text in a picture, and it was closed: API only, no weights. On June 3, 2026, that changed. Ideogram 4 is the company's first open-weight release, a 9.3-billion-parameter model trained from scratch, and it runs on a single 24GB consumer GPU. It also beats much larger open models at the thing Ideogram is known for, rendering text. The catch is in the license, and it is a big one: the weights are free to download and play with, but not free to build a business on. Here is the honest version.

What Ideogram opened up

Ideogram 4 is a flow-matching Diffusion Transformer (DiT). (Flow matching trains the model to move noise to image along a near-straight path, which tends to need fewer sampling steps than older diffusion formulations.) It has 9.3B parameters and 34 layers, is trained from scratch, and the company is explicit that it is not a fine-tune of FLUX or anything else. It generates at native 2K resolution and any aspect ratio up to 6:1, with a Qwen3-VL-8B vision-language model as the text encoder and a frozen Flux VAE handling image compression.
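To make the "near-straight path" idea concrete, here is a toy sketch of the general flow-matching recipe in plain Python. It is not Ideogram's training or sampling code: the three-value "image", the function names, and the oracle velocity that stands in for a trained network are all assumptions made for illustration.

```python
import random


def interpolate(noise, data, t):
    """Point on the straight path from noise (t=0) to data (t=1)."""
    return [(1.0 - t) * n + t * d for n, d in zip(noise, data)]


def target_velocity(noise, data):
    """Velocity a network would be trained to predict on that path.

    For a straight path it is constant in t: simply data minus noise.
    """
    return [d - n for n, d in zip(noise, data)]


def euler_sample(velocity_fn, noise, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(noise)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
data = [0.2, -0.5, 0.9]                       # toy "image" of three values
noise = [rng.gauss(0.0, 1.0) for _ in data]   # starting point of the path

# Hypothetical oracle: returns the exact target velocity, standing in
# for a trained model so the sketch stays self-contained.
def oracle(x, t):
    return target_velocity(noise, data)

midpoint = interpolate(noise, data, 0.5)
sample_4 = euler_sample(oracle, noise, num_steps=4)
```

Because the velocity along a straight path does not change, even a handful of Euler steps lands on the data point in this toy setup. A real model only approximates that velocity field, which is why "fewer steps" is a tendency rather than a guarantee.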

The standout feature is the one Ideogram built its name on: text and typography. It renders multilingual text in images more reliably than models several times its size, and it supports structured JSON prompting, bounding-box layout control, color palettes, and native transparency. For posters, UI mockups, logos with real words on them, and anything where garbled lettering ruins the result, this is the open model to beat.

The license: open weights, closed for business

This is the part to read twice. Ideogram split the license. The inference code is Apache 2.0, fully open. The weights are not: they ship under an "Ideogram 4 Non-Commercial" license. You can download them, fine-tune them, benchmark them, and use them for research and personal projects. You cannot ship a commercial product on them without paying.

Commercial use runs $300 a month for the self-serve tier (10,000 to 100,000 images a month, self-hosting the quantized weights), with an enterprise tier above that for full-precision weights and higher volume. So "open weights" here means open to inspect, not open-source in the OSI sense, and not free for production. That is a defensible model for a company funding from-scratch training, but call it what it is, and budget for it before you build.
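For budgeting, the license fee alone can be turned into a per-image figure at the two ends of the stated volume band. This is a rough sketch that assumes the $300 is a flat monthly fee across the 10,000 to 100,000 range; it leaves out GPU, hosting, and staffing costs, and the function name is illustrative.

```python
MONTHLY_FEE_USD = 300  # self-serve tier, as quoted above


def license_cost_per_image(images_per_month):
    """License fee per image in USD, ignoring compute and hosting."""
    if images_per_month <= 0:
        raise ValueError("images_per_month must be positive")
    return MONTHLY_FEE_USD / images_per_month


low_volume = license_cost_per_image(10_000)    # bottom of the tier
high_volume = license_cost_per_image(100_000)  # top of the tier

print(f"10k images/month:  ${low_volume:.3f} per image")
print(f"100k images/month: ${high_volume:.3f} per image")
```

Under those assumptions the license works out to about 3 cents per image at the bottom of the band and about 0.3 cents at the top, so the fee matters much more for low-volume products.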

Does it actually deliver?

On Ideogram's own numbers, it places second overall behind GPT Image 2 and first among open-weight models in the company's internal designer arena (1062 Elo). The more interesting evidence is independent: on a ContraLabs typography test, designers picked Ideogram 4 first 47.9% of the time, against 15.5% for FLUX.2 and 15.0% for Grok Imagine. It also tops Design Arena's open-weight image leaderboard. The headline for builders: at 9.3B it renders text better than FLUX.2 (32B) and larger MoE image models, a parameter-efficiency win that is exactly why open weights at this size matter.

Capability                  | Ideogram 4 | Notes
Parameters                  | 9.3B       | from scratch, flow-matching DiT
Text rendering (X-Omni OCR) | 0.97       | English text-in-image
Layout (7Bench mIoU)        | 0.69       | bounding-box control
Native resolution           | 2K         | no separate upscaler
Min VRAM (nf4)              | 24GB       | single consumer GPU

Ideogram-reported except where noted; the typography and arena comparisons are third-party (ContraLabs, Design Arena). GenEval and DPG scores were not disclosed.
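For readers unfamiliar with the layout metric, mIoU is a mean intersection-over-union: how much a requested bounding box overlaps the box where the element actually ended up. The sketch below shows the standard IoU calculation for axis-aligned boxes. How 7Bench pairs and detects boxes is not described here, so the pairing by position and the `(x0, y0, x1, y1)` box format are assumptions.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def mean_iou(requested, detected):
    """Average IoU over boxes paired by position in the two lists."""
    if len(requested) != len(detected) or not requested:
        raise ValueError("need two non-empty lists of equal length")
    return sum(box_iou(r, d) for r, d in zip(requested, detected)) / len(requested)


# Two requested regions and where the elements were found (made-up values).
requested = [(0, 0, 2, 2), (4, 4, 6, 6)]
detected = [(1, 1, 3, 3), (4, 4, 6, 6)]
score = mean_iou(requested, detected)
```

A score of 1.0 means every element sits exactly in its requested box, and 0.0 means no overlap at all.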

Limitations and gotchas

  • The weights are non-commercial. Free to experiment; $300 a month or enterprise to ship. This is the headline limitation, not a footnote.
  • The free weights are quantized. Full-precision weights are enterprise-tier only.
  • diffusers supports the nf4 build; fp8 is not wired into diffusers yet. ComfyUI is not officially supported at launch, though community nodes may appear.
  • It is an image model, not a chat model. There is no instruction-editing story here beyond standard image-to-image.
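Some back-of-envelope arithmetic shows why the quantized builds matter for consumer hardware. The sketch below estimates weight storage for the 9.3B transformer alone at different precisions. It assumes roughly 0.5 bytes per parameter for nf4, 1 for fp8, and 2 for bf16, and it ignores quantization metadata, the Qwen3-VL-8B text encoder, the VAE, and activation memory, so real usage will be higher.

```python
# Approximate storage per parameter, in bytes (quantization overhead ignored).
BYTES_PER_PARAM = {"nf4": 0.5, "fp8": 1.0, "bf16": 2.0}

DIT_PARAMS = 9.3e9  # transformer parameter count quoted in the article


def weight_gb(num_params, fmt):
    """Weights-only size in GB (1 GB = 1e9 bytes) for a given format."""
    return num_params * BYTES_PER_PARAM[fmt] / 1e9


estimates = {fmt: weight_gb(DIT_PARAMS, fmt) for fmt in BYTES_PER_PARAM}

for fmt, gb in estimates.items():
    print(f"{fmt}: ~{gb:.2f} GB for the transformer weights alone")
```

Under these assumptions the transformer weights take about 4.65 GB in nf4 against about 18.6 GB in bf16, which leaves far more of a 24GB card for the text encoder, VAE, and activations.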

Who should use it

Use it if you generate images with text (marketing, posters, UI, multilingual creative), want the best open option that fits on a 24GB card, and either your use is research or personal or you are willing to pay for the commercial license. The nf4 build runs on an RTX 3090 or 4090; the fp8 build runs on Apple Silicon via Metal. Skip it if you need a permissive, free-for-commercial image model; in that case a from-scratch Apache 2.0 model is a better fit even if its text rendering is worse. Match the license to your use before you fall in love with the output.

Run it in about 10 minutes

The nf4 weights plus diffusers are the fast path on a single GPU.

# Pull the 4-bit weights (gated: accept the license on the HF page first)
huggingface-cli download ideogram-ai/ideogram-4-nf4

# nf4 fits on a single 24GB GPU. Apache-2.0 code, non-commercial weights.
import torch
from diffusers import Ideogram4Pipeline

pipe = Ideogram4Pipeline.from_pretrained(
    "ideogram-ai/ideogram-4-nf4", torch_dtype=torch.bfloat16
).to("cuda")  # use "mps" on Apple Silicon with the fp8 build

prompt = 'A vintage diner sign that reads "OPEN 24 HOURS" in clean neon'
pipe(prompt).images[0].save("ideogram4.png")

Throw a real text prompt at it, a sign, a poster headline, a label in a non-English script, and compare it side by side with whatever image model you use now. The text is where it wins or it does not, so test the text.
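One way to keep that side-by-side comparison organized is a small harness that renders the same text prompts with each model and saves them under predictable names. This is a sketch, not part of any Ideogram tooling: `run_suite`, `slugify`, and the prompt list are hypothetical, and each generator is assumed to be a callable that takes a prompt and returns an object with a `.save(path)` method, as a PIL image does.

```python
import re
from pathlib import Path

# Example prompts with text to render; swap in your own.
TEXT_PROMPTS = [
    'A vintage diner sign that reads "OPEN 24 HOURS" in clean neon',
    'A concert poster with the headline "LIVE TONIGHT" in bold serif type',
    'A jam jar label that reads "Erdbeere" in handwritten script',
]


def slugify(text, max_len=40):
    """Turn a prompt into a short, filesystem-safe file stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].rstrip("-")


def run_suite(generators, prompts, out_dir):
    """Render every prompt with every generator and return the saved paths.

    generators maps a model name to a callable: prompt -> image-like object.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for model_name, generate in generators.items():
        for index, prompt in enumerate(prompts):
            # The index prefix keeps names unique even if slugs collide.
            path = out_dir / f"{index:02d}-{slugify(prompt)}-{model_name}.png"
            generate(prompt).save(path)
            saved.append(path)
    return saved
```

With the pipeline loaded as above, a generator could be as simple as `lambda p: pipe(p).images[0]`, registered alongside whatever model you use today. Files then sort by prompt, so each prompt's outputs sit next to each other in the output folder.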

Sources and further reading

Tested on: not independently tested in our environment; we did not run the gated weights. The nf4 build is reported to fit a single 24GB GPU and the fp8 build runs on Apple Silicon via Metal. Quality and benchmark figures are Ideogram-reported except the third-party typography comparisons, flagged above.
Date checked: 2026-06-26
