Self-confidence signals enable unsupervised reward training for text-to-image models
Researchers introduce SOLACE, a post-training framework that replaces external reward models with an internal self-confidence signal derived from how accurately a text-to-image model recovers injected noise. The method enables fully unsupervised optimization and shows measurable improvements in compositional generation, text rendering, and text-image alignment.