LLM News

Every LLM release, update, and milestone.

research

Meta researchers show flattened speech tokens outperform hierarchical models with Llama-Mimi

Meta researchers propose Llama-Mimi, a speech language model that flattens the multi-level RVQ tokens produced by neural audio codecs into a single sequence processed by a standard Transformer decoder. The approach outperforms hierarchical models on most tasks and achieves state-of-the-art acoustic consistency.
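A minimal Python sketch of the flattening idea, assuming per-frame RVQ codes are interleaved level by level with per-level codebook offsets so a plain language model sees one token stream; function and variable names here are illustrative, not from Meta's release.

```python
def flatten_rvq(rvq_codes, codebook_size):
    """Interleave per-frame RVQ levels into a single 1-D token stream.

    rvq_codes[t][q] is the code from quantizer level q at frame t.
    Offsetting each level by level * codebook_size keeps the Q codebooks
    disjoint in the flattened vocabulary (hypothetical scheme).
    """
    flat = []
    for frame in rvq_codes:                    # frame = [c_0, c_1, ..., c_{Q-1}]
        for level, code in enumerate(frame):
            flat.append(level * codebook_size + code)
    return flat

# Example: 3 frames, 2 RVQ levels, codebook size 1024.
codes = [[17, 803], [5, 41], [990, 12]]
print(flatten_rvq(codes, 1024))  # [17, 1827, 5, 1065, 990, 1036]
```

The flattened sequence can then be fed to an off-the-shelf decoder-only Transformer, which is the point of the approach: no hierarchical or multi-stream modeling machinery is needed.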

research

ELMUR extends RL memory horizons up to 100,000x with structured external memory

Researchers introduce ELMUR, a transformer variant that adds structured external memory to handle long-horizon reinforcement learning problems under partial observability. The system extends effective decision-making horizons beyond standard attention windows by up to 100,000x and achieves 100% success on synthetic tasks with corridors spanning one million steps.
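The paper's exact update rule isn't reproduced here; below is a minimal PyTorch sketch of the general pattern the summary describes: a small bank of persistent memory tokens that each window reads from and writes to via cross-attention, with a gated overwrite carrying information across windows. Every module, shape, and hyperparameter name is an assumption for illustration, not ELMUR's published architecture.

```python
import torch
import torch.nn as nn

class MemoryAugmentedBlock(nn.Module):
    """Transformer block with a persistent external memory bank (sketch)."""

    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.read_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.write_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(d_model, d_model)  # controls how much memory changes
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, memory):
        # 1) Standard self-attention within the current observation window.
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + h)
        # 2) Read: window tokens attend to the persistent memory bank.
        r, _ = self.read_attn(x, memory, memory)
        x = self.norm2(x + r)
        # 3) Write: memory attends to the window, then a sigmoid-gated blend
        #    overwrites old slots so information persists across windows.
        w, _ = self.write_attn(memory, x, x)
        g = torch.sigmoid(self.gate(w))
        memory = (1 - g) * memory + g * w
        return x, memory

# Rollout over far more steps than one attention window: memory is carried
# across windows, decoupling the effective horizon from the window size.
block = MemoryAugmentedBlock()
memory = torch.zeros(1, 8, 64)             # persistent memory bank (batch, slots, dim)
with torch.no_grad():
    for _ in range(1000):                  # 1000 windows of 32 steps each
        window = torch.randn(1, 32, 64)    # one window of observation embeddings
        out, memory = block(window, memory)
```

The design choice the sketch illustrates is that only the compact memory bank, not the full history, is passed between windows, which is what lets effective decision horizons grow far beyond the attention span at constant per-step cost.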