model release

Ideogram Releases First Open-Weight Image Model With 9.3B Parameters and 2K Native Resolution

TL;DR

Ideogram has released Ideogram 4, a 9.3B-parameter open-weight text-to-image model trained from scratch. The model features structured JSON prompting and native 2K resolution output, and ranks as the top open-weight model on Design Arena. It is available in fp8 and nf4 quantizations under a non-commercial license.

June 3, 2026 · 10:51 PM · 2 min read


Ideogram has released Ideogram 4, a 9.3B-parameter open-weight text-to-image model trained entirely from scratch. The model ships in two quantizations, nf4 (CUDA-only, Diffusers-compatible) and fp8 (cross-platform), both under the Ideogram 4 Non-Commercial license.
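As a rough guide to what the two quantizations mean for storage, the sketch below estimates the weight footprint from the stated parameter count. It assumes 1 byte per parameter for fp8 and 0.5 bytes for nf4, and it ignores quantization metadata, the text encoder, any VAE, and activation memory, so real downloads and VRAM use will be larger. The function name and constants are illustrative, not taken from Ideogram's code.

```python
PARAMS = 9.3e9  # parameter count stated in the article

# Assumed storage cost per parameter, ignoring quantization metadata.
BYTES_PER_PARAM = {"fp8": 1.0, "nf4": 0.5}


def weight_footprint_gb(params: float, bytes_per_param: float) -> float:
    """Return the approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for name, width in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_footprint_gb(PARAMS, width):.2f} GB")
# fp8: ~9.30 GB
# nf4: ~4.65 GB
```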

Architecture and Technical Specifications

Ideogram 4 uses a fully single-stream Diffusion Transformer (DiT) architecture with 34 layers. Unlike text-to-image models that keep text conditioning in a separate pathway, it concatenates text and image tokens into one unified sequence processed by the same transformer, enabling cross-modal interaction at every layer.
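To give a feel for what a single unified sequence implies, the sketch below counts the tokens a joint transformer would attend over. The 16-pixel effective patch size and the 256-token prompt length are assumptions made for illustration; the article does not state either value, and the function names are hypothetical.

```python
def image_token_count(width: int, height: int, patch_px: int = 16) -> int:
    """Number of image tokens, assuming one token per patch_px x patch_px patch."""
    return (width // patch_px) * (height // patch_px)


def joint_sequence_length(text_tokens: int, width: int, height: int,
                          patch_px: int = 16) -> int:
    """Length of the single text+image sequence seen by every layer."""
    return text_tokens + image_token_count(width, height, patch_px)


text_tokens = 256  # hypothetical prompt length
n = joint_sequence_length(text_tokens, 2048, 2048)
print(n)  # 16640 under the assumptions above
```

Because self-attention compares every token with every other token, the attention cost in such a design grows with the square of this joint length, which is why the image resolution dominates the compute budget.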

The model uses Qwen3-VL-8B-Instruct as its text encoder instead of CLIP or T5. Hidden states are extracted from 13 intermediate layers and concatenated, providing multi-scale semantic features. The model supports resolutions from 256px to 2048px (in multiples of 16) with aspect ratios up to 6:1.
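The resolution constraints above can be expressed as a small check. This is a minimal sketch, assuming the 256px to 2048px bounds apply to each side independently and that the 6:1 limit is inclusive and holds in both orientations; the function name is hypothetical and not part of Ideogram's inference code.

```python
MIN_SIDE = 256    # smallest supported side, in pixels
MAX_SIDE = 2048   # largest supported side, in pixels
STEP = 16         # sides must be multiples of this value
MAX_ASPECT = 6.0  # longest side divided by shortest side


def is_supported_size(width: int, height: int) -> bool:
    """Return True if (width, height) satisfies the stated constraints."""
    for side in (width, height):
        if side < MIN_SIDE or side > MAX_SIDE or side % STEP != 0:
            return False
    return max(width, height) / min(width, height) <= MAX_ASPECT


print(is_supported_size(2048, 2048))  # True
print(is_supported_size(1536, 256))   # True: exactly 6:1
print(is_supported_size(2048, 256))   # False: 8:1 exceeds the limit
print(is_supported_size(1000, 1000))  # False: not a multiple of 16
```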

Benchmark Performance

According to Ideogram, the model ranks first among open-weight models on Design Arena, a third-party image generation leaderboard focused on design tasks. On the overall Design Arena board, Ideogram 4 trails only proprietary models from OpenAI (GPT Image) and Google (Gemini).

In ContraLabs' blind typography evaluation with ten professional designers, Ideogram 4 was selected as best 47.9% of the time, ahead of Gemini 3.1 Flash Image Preview (30.0%), FLUX.2 [max] (15.5%), and Grok Imagine 1.0 (15.0%). The same designers rated it 3.55/5 for usability in real client work, higher than any of the competing models.

On standard open-source benchmarks, Ideogram claims the model leads all tested models on layout control (7Bench) and delivers better text rendering than larger open-weight alternatives including Qwen-Image (20B), FLUX.2 [dev] (32B), and HunyuanImage 3.0 (80B MoE).

Key Features

The model introduces structured JSON prompting, allowing explicit control over composition, style, lighting, color palette, typography, and spatial layout through bounding-box coordinates. It supports multilingual text rendering and can generate images at native 2K resolution without upscaling.
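A structured prompt of this kind might look like the sketch below. Every field name here (`description`, `style`, `lighting`, `color_palette`, `elements`, `bbox`) is hypothetical, as is the convention of normalized `[x0, y0, x1, y1]` bounding boxes; the article does not document the actual schema, so consult Ideogram's inference code for the real format.

```python
import json

# Hypothetical schema: none of these keys are confirmed by the article.
prompt = {
    "description": "Minimalist poster for a jazz festival",
    "style": "flat vector illustration",
    "lighting": "soft, even",
    "color_palette": ["#0B1D3A", "#F2C14E", "#FFFFFF"],
    "elements": [
        {
            "type": "text",
            "content": "BLUE NOTE NIGHTS",
            "typography": "bold condensed sans-serif",
            "bbox": [0.10, 0.08, 0.90, 0.28],
        },
        {
            "type": "object",
            "content": "saxophone silhouette",
            "bbox": [0.25, 0.35, 0.75, 0.90],
        },
    ],
}


def is_valid_bbox(bbox: list) -> bool:
    """Check an assumed normalized [x0, y0, x1, y1] box with positive area."""
    if len(bbox) != 4 or not all(0.0 <= v <= 1.0 for v in bbox):
        return False
    x0, y0, x1, y1 = bbox
    return x0 < x1 and y0 < y1


assert all(is_valid_bbox(e["bbox"]) for e in prompt["elements"])
payload = json.dumps(prompt, indent=2)  # string a client would pass as the prompt
print(payload)
```

Validating boxes before serializing catches layout mistakes, such as swapped corners, before any generation time is spent.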

Inference code is available on GitHub, with model weights hosted on Hugging Face behind a license gate. Downloading the weights requires authenticating with a Hugging Face token. The inference code optionally integrates with Ideogram's hosted "magic prompt" API for prompt expansion and with Hive for safety screening.

What This Means

Ideogram 4 is a notable open-weight text-to-image release, particularly for design-focused applications. At 9.3B parameters it is less than a third the size of competing open models such as FLUX.2 [dev] (32B), yet Ideogram claims superior performance on design and typography benchmarks. The non-commercial license, however, limits production use. Structured JSON prompting and native high-resolution output address key limitations of previous open-weight image models, though real-world performance will depend on community validation beyond company-provided benchmarks.

Source: huggingface.co

