research
New safety steering technique reduces unsafe T2I outputs without degrading image quality
Researchers introduce Conditioned Activation Transport (CAT), a technique that reduces unsafe content generation in text-to-image (T2I) models at inference time without the quality degradation seen in earlier linear steering approaches. The method uses a contrastive dataset of 2,300 safe/unsafe prompt pairs and geometry-based conditioning so that steering is applied only to activation regions associated with unsafe content.
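
To make the idea of conditioned steering concrete, here is a minimal sketch in Python. It is not the authors' CAT implementation: the summary does not describe the transport map or the exact conditioning rule, so this example assumes a simple stand-in, where a direction is estimated from contrastive activations and a shift is applied only to activations whose projection onto that direction exceeds a threshold. The function names, the threshold gate, and the toy 2-D activations are all hypothetical.

```python
import numpy as np


def steering_direction(safe_acts, unsafe_acts):
    """Unit vector pointing from the mean safe activation to the mean unsafe one."""
    diff = unsafe_acts.mean(axis=0) - safe_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def conditioned_steer(acts, direction, threshold, target):
    """Shift only the activations that fall in the assumed 'unsafe' region.

    An activation is treated as unsafe when its projection onto `direction`
    exceeds `threshold` (a hypothetical gate, standing in for the
    geometry-based conditioning). Gated activations are moved along
    `direction` until their projection equals `target`; all other
    activations are returned unchanged.
    """
    proj = acts @ direction
    gate = proj > threshold
    shift = np.where(gate, proj - target, 0.0)
    return acts - shift[:, None] * direction


# Toy contrastive activations (hypothetical 2-D values, one row per prompt).
safe_acts = np.array([[0.0, 0.0], [0.0, 2.0]])
unsafe_acts = np.array([[4.0, 0.0], [4.0, 2.0]])

direction = steering_direction(safe_acts, unsafe_acts)
target = float(safe_acts.mean(axis=0) @ direction)           # safe-side projection
unsafe_proj = float(unsafe_acts.mean(axis=0) @ direction)
threshold = 0.5 * (target + unsafe_proj)                      # midpoint gate

new_acts = np.array([[0.5, 3.0], [5.0, -1.0]])
steered = conditioned_steer(new_acts, direction, threshold, target)
print(steered)
```

In this sketch the first row of `new_acts` lies on the safe side of the gate and passes through untouched, while the second row is moved back along the estimated direction. An unconditioned linear steer would shift every activation, which is one plausible reading of why earlier approaches degrade image quality on benign prompts.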