LLM News

Every LLM release, update, and milestone.

research

Progressive Residual Warmup improves LLM pretraining stability and convergence speed

Researchers propose Progressive Residual Warmup (ProRes), a pretraining technique that staggers layer learning by gradually warming residual connections from 0 to 1, with deeper layers taking longer to activate. The method demonstrates faster convergence, stronger generalization, and improved downstream performance across multiple model scales and initialization schemes.
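The summary describes scaling each residual branch by a coefficient that warms from 0 to 1, with longer warmups at greater depth. A minimal sketch of that idea (the class name, the depth-scaled schedule, and the per-forward step counter are assumptions for illustration, not the paper's exact method):

```python
import torch
import torch.nn as nn

class WarmedResidualBlock(nn.Module):
    """Residual block whose branch is scaled by a warmup coefficient
    alpha in [0, 1]. Hypothetical reconstruction of the ProRes idea:
    at alpha = 0 the block is an identity map; it activates gradually."""

    def __init__(self, dim, layer_idx, base_warmup_steps=1000):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )
        # Deeper layers take longer to activate: warmup length
        # grows linearly with layer index (assumed schedule).
        self.warmup_steps = base_warmup_steps * (1 + layer_idx)
        self.register_buffer("step", torch.zeros((), dtype=torch.long))

    def alpha(self):
        return min(1.0, self.step.item() / self.warmup_steps)

    def forward(self, x):
        if self.training:
            # Simplification: counts forward passes, not optimizer steps.
            self.step += 1
        # y = x + alpha * f(x)
        return x + self.alpha() * self.ff(x)
```

With `layer_idx=0` and `base_warmup_steps=10`, the block reaches full residual strength after 10 training-mode forward passes, while a block at `layer_idx=3` would take 40.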

research

Stable-LoRA addresses feature learning instability in low-rank adaptation fine-tuning

Researchers have identified a fundamental instability in Low-Rank Adaptation (LoRA), the widely used parameter-efficient fine-tuning method, and propose Stable-LoRA as a solution. The new approach uses dynamic weight shrinkage to keep feature learning stable during training while preserving LoRA's efficiency benefits.
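Standard LoRA adds a trainable low-rank update `(alpha/r) * B A` on top of frozen weights. One way the "dynamic weight shrinkage" idea could look in code (the `shrink` rule below is a hypothetical sketch; the summary does not specify the paper's exact shrinkage schedule):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer with a standard LoRA adapter:
    y = base(x) + (alpha / r) * x A^T B^T.
    The shrink() step is an assumed illustration of dynamic weight
    shrinkage, not Stable-LoRA's published rule."""

    def __init__(self, in_dim, out_dim, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        self.base.weight.requires_grad_(False)  # pretrained weights frozen
        self.A = nn.Parameter(torch.randn(r, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, r))  # update starts at 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

    @torch.no_grad()
    def shrink(self, factor=0.999):
        # Multiplicative shrinkage of the adapter toward zero after each
        # optimizer step, damping runaway growth of learned features.
        self.A.mul_(factor)
        self.B.mul_(factor)
```

In a training loop, `shrink()` would be called after `optimizer.step()`, acting like a weight-decay term applied only to the adapter matrices.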