Nemotron-Labs-TwoTower-30B-A3B-Base-BF16

NVIDIA 🇺🇸 United States
active

Version History

base-bf16 (major)

First release of Nemotron-Labs-TwoTower, a block-wise diffusion model built on the Nemotron-3-Nano-30B backbone. It uses a dual-tower architecture to generate blocks of tokens in parallel. NVIDIA claims a 2.42× generation speedup while retaining 98.7% of baseline quality.

Benchmark Scores

HumanEval: 75.6%
MMLU: 78.2%
MMLU-Pro: 60.9%

Coverage

model release · NVIDIA

NVIDIA releases Nemotron-Labs-TwoTower-30B: block-wise diffusion model claims 2.42× faster generation at 98.7% of baseline quality

NVIDIA released Nemotron-Labs-TwoTower-30B-A3B-Base-BF16, a block-wise diffusion language model that generates text by denoising blocks of tokens in parallel rather than sequentially. According to NVIDIA, the model achieves 2.42× the wall-clock generation throughput of its autoregressive baseline while retaining 98.7% of aggregate benchmark quality.
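
To illustrate why generating blocks in parallel can reduce the number of model calls, here is a minimal sketch that only counts forward passes. The block size, the number of denoising steps per block, and both function names are hypothetical and chosen for illustration; none of them come from NVIDIA's release, and the sketch does not model the dual-tower architecture.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """Sequential decoding: one forward pass per generated token."""
    return num_tokens


def blockwise_passes(num_tokens: int, block_size: int, denoise_steps: int) -> int:
    """Block-wise decoding: every block of `block_size` tokens is refined
    for `denoise_steps` forward passes (both values are assumptions)."""
    num_blocks = math.ceil(num_tokens / block_size)
    return num_blocks * denoise_steps


tokens = 1024
ar = autoregressive_passes(tokens)
bw = blockwise_passes(tokens, block_size=16, denoise_steps=4)  # hypothetical settings
print(ar, bw, ar / bw)  # prints: 1024 256 4.0
```

The ratio of pass counts is not a wall-clock speedup: a pass over a whole block typically costs more than a single-token pass, so a measured throughput figure such as the reported 2.42× can be lower than the pass-count ratio suggests.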
