LLM News | TPS

research

ELMUR extends RL memory horizons 100,000x with structured external memory architecture

Researchers introduce ELMUR, a transformer variant that adds structured external memory for long-horizon reinforcement learning under partial observability. The system extends the effective decision-making horizon to as much as 100,000 times the length of a standard attention window and achieves 100% success on synthetic tasks with corridors spanning one million steps.
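To make the attention-window limitation concrete, here is a minimal toy sketch in Python. It is not ELMUR's mechanism: the function `run_corridor`, the dictionary used as "memory", and the corridor and window sizes are all hypothetical choices made for illustration. It only shows why a cue seen at the start of a long corridor is lost once it slides out of a fixed-size context, and how any persistent store outside that context keeps it recoverable.

```python
from collections import deque


def run_corridor(length, window, use_memory):
    """Walk a corridor of `length` steps where a cue is visible only at step 0.

    Returns True if the cue can still be recovered at the final step.
    All names and sizes here are illustrative assumptions, not ELMUR's design.
    """
    cue = "left"
    # Sliding context: keeps only the most recent `window` observations,
    # standing in for a fixed attention window.
    context = deque(maxlen=window)
    # Hypothetical external memory that persists for the whole episode.
    memory = {}

    for t in range(length):
        # Every step after the first is an uninformative corridor step.
        obs = cue if t == 0 else None
        context.append(obs)
        if use_memory and obs is not None:
            memory["cue"] = obs  # write the informative observation

    # At the end of the corridor, look for the cue in the context first.
    recalled = next((o for o in context if o is not None), None)
    if recalled is None and use_memory:
        recalled = memory.get("cue")  # fall back to the external memory
    return recalled == cue


if __name__ == "__main__":
    # Arbitrary toy sizes: a corridor 100x longer than the context window.
    print(run_corridor(length=1000, window=10, use_memory=False))
    print(run_corridor(length=1000, window=10, use_memory=True))
```

In this toy setting the context-only agent succeeds only while the corridor fits inside the window, whereas the memory-backed agent succeeds at any length. A learned system has the harder problem of deciding what to write and when to overwrite it, which this sketch deliberately leaves out.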

March 5, 2026 · 5:07 AM · 2 min read

reinforcement-learning transformer-architecture memory-architecture

via arxiv.org