
Nucleus Image: A New Open-Source Sparse MoE Text-to-Image Model

Author: Viacheslav Vasipenok

Yes, another text-to-image model just dropped — and this one is genuinely interesting.

Meet Nucleus Image, the latest release from Nucleus AI, a small San Francisco-based AI company founded in 2023. While it’s still very early, the model brings something fresh to the open-source image generation scene: a sparse Mixture-of-Experts (MoE) Diffusion Transformer.

Key Highlights

  • 17B total parameters, but only ~2B active per forward pass (thanks to 64 routed experts + 1 shared expert).
  • They call it “the 1st Sparse MoE Diffusion Transformer”.
  • 32-layer architecture, with 29 layers using sparse MoE instead of dense FFN (first 3 layers remain dense for training stability).
  • Uses Grouped-Query Attention.
  • Text encoder: Qwen3-VL-8B-Instruct.
  • VAE: Qwen-Image VAE (16-channel).
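
To make the "17B total, ~2B active" idea concrete, here is a minimal toy sketch of a sparse MoE feed-forward step with one always-on shared expert and a handful of routed experts chosen per token. The layer counts and expert counts come from the list above; everything else is an assumption for illustration. In particular, the article does not say how many routed experts fire per token (TOP_K = 4 is a guess), the softmax top-k router is a common design but not confirmed for this model, and the scalar "experts" stand in for real neural network blocks.

```python
import math

NUM_LAYERS = 32        # from the article
NUM_DENSE_LAYERS = 3   # the first 3 layers keep a dense FFN
NUM_ROUTED = 64        # routed experts per MoE layer
TOP_K = 4              # ASSUMPTION: experts-per-token is not stated in the article


def layer_kinds():
    """'dense' for the first 3 layers, 'moe' for the remaining 29."""
    return ["dense" if i < NUM_DENSE_LAYERS else "moe" for i in range(NUM_LAYERS)]


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, k=TOP_K):
    """Pick the k highest-scoring routed experts and renormalise their weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_ffn(x, router_logits, routed_experts, shared_expert, k=TOP_K):
    """The shared expert always runs; only k routed experts run for this token."""
    out = shared_expert(x)
    for idx, weight in route(router_logits, k):
        out += weight * routed_experts[idx](x)
    return out


# Toy usage: scalar "tokens" and experts that just scale their input.
experts = [lambda x, scale=i: x * scale for i in range(NUM_ROUTED)]
shared = lambda x: x
logits = [0.1 * i for i in range(NUM_ROUTED)]  # stand-in for a learned router
y = moe_ffn(1.0, logits, experts, shared)
print(layer_kinds().count("moe"), [i for i, _ in route(logits)], y)
```

The point of the design is that the 60 unselected experts are never evaluated for this token, so compute per token tracks the active parameters rather than the total.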

The model is released as a base model — no DPO, RLHF, or heavy human preference tuning yet. According to the team, this is intentional: they want to release a strong foundation first.


Training Scale

What makes it particularly noteworthy is the dataset size:

  • ~1.5 billion image-text training pairs;
  • ~700 million unique images.

That’s serious scale for an open-source effort.


Current Status (as of May 2026)

  • Weights are available;
  • Detailed technical report and model card published;
  • Claimed “Day 0” support in Hugging Face Diffusers;
  • Code has not been released yet (despite heavy “truly open” messaging).

Faces are still a bit weak (typical for early base models), but overall image quality looks promising for its efficiency class.

The model is small enough that it should (with some effort) fit into 16GB VRAM, and the sparse MoE design suggests fast inference relative to its total parameter count.
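
A quick back-of-the-envelope check shows what "with some effort" likely means. The sketch below estimates the memory needed just to hold the diffusion transformer's weights at a few precisions. The parameter counts are from the article; the precision options and the 16 GiB budget are assumptions for illustration, and the estimate ignores the text encoder, the VAE, and activations, all of which need memory too.

```python
TOTAL_PARAMS = 17e9   # total parameters, from the article
ACTIVE_PARAMS = 2e9   # approximate active parameters per forward pass

# ASSUMPTION: illustrative storage precisions, not formats confirmed for this model.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_gib(num_params, bytes_per_param):
    """Memory needed just to store the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


def fits(num_params, precision, budget_gib=16.0):
    """True if the weights alone fit inside the given memory budget."""
    return weight_gib(num_params, BYTES_PER_PARAM[precision]) <= budget_gib


for precision, nbytes in BYTES_PER_PARAM.items():
    size = weight_gib(TOTAL_PARAMS, nbytes)
    print(f"{precision}: {size:.1f} GiB, fits in 16 GiB: {fits(TOTAL_PARAMS, precision)}")

print(f"active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
```

Under these assumptions the weights take about 31.7 GiB in bf16, about 15.8 GiB at 8-bit, and about 7.9 GiB at 4-bit. Sparsity cuts compute per step, since only around 12% of the parameters are active, but every expert still has to live somewhere, so a 16GB card would realistically need quantization, offloading, or both.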


Why This Matters

Sparse MoE architectures have already revolutionized language models (Mixtral, DeepSeek, etc.). Bringing the same efficiency breakthrough to diffusion models could be a big deal, especially if the code drops and the community starts fine-tuning it.

Nucleus AI is a small team (appears to be 2–10 people based on public profiles), yet they’re swinging big: back in 2023 they released a 22B-parameter LLM trained on 500B tokens. This feels like a serious attempt to compete in the image generation space.

We’ll have a better idea of real-world performance once the full code and inference scripts are public. Until then, you can check out samples on their site:

https://withnucleus.ai/image

Another day, another impressive open-source drop. The image generation race is far from over — and the gap between closed and open models continues to shrink.
