Qwen-Image: A New Open-Source 20B MMDiT Model for Image Generation

August 4, 2025 at 09:43 PM | Updated: July 26, 2026 at 01:06 PM | Author: Viacheslav Vasipenok | 2 min read

Alibaba’s Qwen team has released Qwen-Image, a 20-billion-parameter MMDiT (Multimodal Diffusion Transformer) model for image generation, now available as open source on Hugging Face. What sets it apart from other image generators is its ability to render text natively inside the images it creates.

Key Highlights:

State-of-the-Art Text Rendering: Qwen-Image delivers top-tier performance in text rendering, rivaling GPT-4o in English and leading its class in Chinese. It excels at handling complex layouts, multi-line text, and fine details with impressive accuracy.
Bilingual Support and Diverse Fonts: The model supports both English and Chinese seamlessly, adapting to a variety of fonts and understanding intricate typographic nuances, making it a versatile choice for multilingual designs.

Creative Versatility: Beyond text, Qwen-Image shines in generating images across a wide range of styles. From photorealistic scenes to vibrant anime, from impressionist masterpieces to minimalist designs, it adapts fluidly to creative prompts. This flexibility makes it a powerful tool for artists, designers, and content creators looking to explore diverse visual aesthetics.
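For readers who want to try the text-rendering and style claims above, here is a minimal sketch of how the model might be loaded through the Hugging Face diffusers library. It is not taken from the announcement: the repository id "Qwen/Qwen-Image", the step count, and the helper names build_prompt and generate are assumptions made for illustration, and actually running the model requires a large GPU and a sizeable download.

```python
def build_prompt(text: str, style: str = "photorealistic",
                 surface: str = "a shop sign") -> str:
    """Compose a prompt asking for exact text rendered in a given visual style.

    Hypothetical helper for this article; it is not part of any Qwen API.
    """
    return f'{style} image of {surface} that reads "{text}"'


def generate(prompt: str, model_id: str = "Qwen/Qwen-Image", steps: int = 50):
    """Load the pipeline and render one image.

    The model id and step count are assumptions; check the model card
    for the recommended settings before relying on them.
    """
    # Imported lazily so build_prompt works without torch/diffusers installed.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe = pipe.to("cuda")
    return pipe(prompt=prompt, num_inference_steps=steps).images[0]


if __name__ == "__main__":
    # Mixed Chinese and English text, rendered in an anime style.
    prompt = build_prompt("通义千问 Qwen Coffee", style="anime")
    print(prompt)
    # Uncomment on a machine with a suitable GPU:
    # generate(prompt).save("qwen_image_demo.png")
```

Keeping the prompt construction separate from the model call makes it easy to compare the same text across styles (photorealistic, anime, impressionist, minimalist) without reloading the pipeline.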
