Nemotron-Labs-TwoTower-30B-A3B-Base-BF16

Name: Nemotron-Labs-TwoTower-30B-A3B-Base-BF16
Author: NVIDIA

NVIDIA · 🇺🇸 United States

Status: active

Version History

base-bf16 · major · July 4, 2026

First release of Nemotron-Labs-TwoTower, a block-wise diffusion language model built on the Nemotron-3-Nano-30B backbone. It uses a dual-tower architecture to generate blocks of tokens in parallel rather than one token at a time. NVIDIA claims a 2.42× generation speedup over the autoregressive baseline while retaining 98.7% of its aggregate benchmark quality.
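To put the two headline figures side by side, here is a minimal arithmetic sketch. It assumes the 2.42× figure is a throughput ratio against the baseline, so the time needed for the same output is its reciprocal; the function and constant names are illustrative and do not come from NVIDIA.

```python
def relative_latency(speedup: float) -> float:
    """Fraction of baseline wall-clock time needed at a given throughput speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


# Figures as claimed in the release entry above (not independently measured).
CLAIMED_SPEEDUP = 2.42
CLAIMED_RETENTION = 0.987

latency_fraction = relative_latency(CLAIMED_SPEEDUP)  # time vs. baseline
time_saved = 1.0 - latency_fraction                   # share of time removed
quality_loss = 1.0 - CLAIMED_RETENTION                # share of quality given up

print(f"time vs. baseline: {latency_fraction:.1%}")
print(f"time saved:        {time_saved:.1%}")
print(f"quality given up:  {quality_loss:.1%}")
```

Under that assumption, the claim amounts to generating the same output in roughly 41% of the baseline time in exchange for about 1.3% of aggregate benchmark quality.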

Benchmark Scores

HumanEval: 75.6%

MMLU: 78.2%

MMLU-Pro: 60.9%

Coverage

model release · NVIDIA

NVIDIA releases Nemotron-Labs-TwoTower-30B: block-wise diffusion model claims 2.42× generation throughput at 98.7% of baseline quality

NVIDIA released Nemotron-Labs-TwoTower-30B-A3B-Base-BF16, a block-wise diffusion language model that generates text by denoising blocks of tokens in parallel rather than sequentially. According to NVIDIA, the model achieves 2.42× the wall-clock generation throughput of its autoregressive baseline while retaining 98.7% of aggregate benchmark quality.

July 4, 2026 · 7:51 AM · 2 min read

NVIDIA · Nemotron · diffusion models