LLM News

Every LLM release, update, and milestone.

Filtered by: activation-steering

research

Researchers develop inference-time personality sliders for LLMs without retraining

Researchers have developed a parameter-efficient method to control LLM personalities at inference time using Sequential Adaptive Steering (SAS), which orthogonalizes steering vectors to avoid interference when adjusting multiple traits simultaneously. The approach allows users to modulate the Big Five personality dimensions by adjusting numerical coefficients without retraining models.

March 5, 2026 · 5:55 AM · 2 min read

llm-alignment inference-time-steering parameter-efficiency

via arxiv.org ↗
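
To make the orthogonalization idea in the summary above concrete, here is a minimal sketch. It assumes plain Gram-Schmidt as a stand-in for whatever procedure SAS actually uses, and the function names (orthogonalize, steer), the toy 3-dimensional vectors, and the coefficient values are all hypothetical, not taken from the paper. The point is only that once trait directions are orthogonal, changing one coefficient does not move the hidden state along another trait's direction.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthogonalize(vectors, eps=1e-12):
    """Sequentially orthonormalize steering vectors (Gram-Schmidt).

    Assumption: this stands in for the paper's procedure, which may differ.
    """
    basis = []
    for v in vectors:
        r = list(v)
        # Remove the components already covered by earlier trait directions.
        for b in basis:
            p = dot(r, b)
            r = [x - p * y for x, y in zip(r, b)]
        norm = math.sqrt(dot(r, r))
        if norm < eps:
            raise ValueError("steering vector depends linearly on earlier ones")
        basis.append([x / norm for x in r])
    return basis

def steer(hidden, basis, coeffs):
    """Add each orthogonal trait direction, scaled by its slider coefficient."""
    out = list(hidden)
    for b, c in zip(basis, coeffs):
        out = [h + c * x for h, x in zip(out, b)]
    return out

# Hypothetical raw trait directions that overlap along the first axis.
raw = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
basis = orthogonalize(raw)

# Hypothetical slider settings: +2.0 on the first trait, -1.0 on the second.
steered = steer([0.0, 0.0, 0.0], basis, [2.0, -1.0])
```

In a real model the hidden state would be a layer activation with thousands of dimensions and the raw directions would be extracted from the model itself; the arithmetic stays the same.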

research

New safety steering technique reduces unsafe T2I outputs without degrading image quality

Researchers introduce Conditioned Activation Transport (CAT), a technique that reduces unsafe content generation in text-to-image models during inference without the quality degradation seen in previous linear steering approaches. The method uses a contrastive dataset of 2,300 safe/unsafe prompt pairs and geometry-based conditioning to target only unsafe activation regions.

March 5, 2026 · 1:08 AM · 2 min read

text-to-image safety activation-steering

via arxiv.org ↗