Progressive Residual Warmup improves LLM pretraining stability and convergence speed
Researchers propose Progressive Residual Warmup (ProRes), a pretraining technique that staggers layer learning: each layer's residual branch is scaled by a coefficient that warms gradually from 0 to 1, with deeper layers taking longer to reach full strength and thus activating later. The method demonstrates faster convergence, stronger generalization, and improved downstream performance across multiple model scales and initialization schemes.
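The paper's exact schedule is not reproduced here, so the following is a minimal PyTorch sketch of the core idea under stated assumptions: each block's residual branch is multiplied by a coefficient alpha that warms linearly from 0 to 1, and deeper layers are given proportionally longer warmup horizons so they come online later. The class name `ProResBlock`, the linear schedule, the depth-proportional `warmup_steps`, and the feed-forward-only block body are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ProResBlock(nn.Module):
    """Residual block whose branch output is scaled by a warmup
    coefficient alpha in [0, 1]. At alpha = 0 the block is an exact
    identity map; at alpha = 1 it is a standard residual block.
    (Illustrative sketch; schedule and block body are assumptions.)"""

    def __init__(self, d_model: int, layer_idx: int,
                 base_warmup_steps: int = 1000):
        super().__init__()
        self.ff = nn.Sequential(
            nn.LayerNorm(d_model),
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        # Assumption: deeper layers warm up over proportionally more
        # steps, so layer 0 activates first and the last layer last.
        self.warmup_steps = base_warmup_steps * (layer_idx + 1)

    def alpha(self, step: int) -> float:
        # Linear warmup from 0 to 1; the schedule shape is an assumption.
        return min(1.0, step / self.warmup_steps)

    def forward(self, x: torch.Tensor, step: int) -> torch.Tensor:
        # Scale only the residual branch; the skip path stays intact.
        return x + self.alpha(step) * self.ff(x)


# Usage: a 12-layer stack evaluated at three points in training.
blocks = nn.ModuleList(ProResBlock(d_model=512, layer_idx=i) for i in range(12))
x = torch.randn(8, 16, 512)
for step in (0, 500, 12000):
    h = x
    for blk in blocks:
        h = blk(h, step)
```

At step 0 every block reduces to the identity, so training starts from a near-identity network and layers switch on shallow-first; this staggered activation is the stability intuition the summary describes.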