LLM News

Every LLM release, update, and milestone.

research

Meta researchers show flattened speech tokens outperform hierarchical models with Llama-Mimi

Meta researchers propose Llama-Mimi, a speech language model that flattens the multi-level RVQ tokens produced by neural audio codecs into a single sequence processed by a standard Transformer decoder. The approach outperforms hierarchical models on most tasks and achieves state-of-the-art acoustic consistency.
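A minimal Python sketch of the flattening idea, assuming per-frame RVQ codes are interleaved level by level with per-level codebook offsets so a plain language model sees one token stream; function and variable names here are illustrative, not from Meta's release.

```python
def flatten_rvq(rvq_codes, codebook_size):
    """Interleave per-frame RVQ levels into a single 1-D token stream.

    rvq_codes[t][q] is the code from quantizer level q at frame t.
    Offsetting each level by level * codebook_size keeps the Q codebooks
    disjoint in the flattened vocabulary (hypothetical scheme).
    """
    flat = []
    for frame in rvq_codes:                    # frame = [c_0, c_1, ..., c_{Q-1}]
        for level, code in enumerate(frame):
            flat.append(level * codebook_size + code)
    return flat

# Example: 3 frames, 2 RVQ levels, codebook size 1024.
codes = [[17, 803], [5, 41], [990, 12]]
print(flatten_rvq(codes, 1024))  # [17, 1827, 5, 1065, 990, 1036]
```

The flattened sequence can then be fed to an off-the-shelf decoder-only Transformer, which is the point of the approach: no hierarchical or multi-stream modeling machinery is needed.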

research

ELMUR extends RL memory horizons up to 100,000x with structured external memory

Researchers introduce ELMUR, a transformer variant that adds structured external memory to handle long-horizon reinforcement learning problems under partial observability. The system extends effective decision-making horizons beyond standard attention windows by up to 100,000x and achieves 100% success on synthetic tasks with corridors spanning one million steps.
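The paper's exact update rule isn't reproduced here; below is a minimal PyTorch sketch of the general pattern the summary describes: a small bank of persistent memory tokens that each window reads from and writes to via cross-attention, with a gated overwrite carrying information across windows. Every module, shape, and hyperparameter name is an assumption for illustration, not ELMUR's published architecture.

```python
import torch
import torch.nn as nn

class MemoryAugmentedBlock(nn.Module):
    """Transformer block with a persistent external memory bank (sketch)."""

    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.read_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.write_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(d_model, d_model)  # controls how much memory changes
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, memory):
        # 1) Standard self-attention within the current observation window.
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + h)
        # 2) Read: window tokens attend to the persistent memory bank.
        r, _ = self.read_attn(x, memory, memory)
        x = self.norm2(x + r)
        # 3) Write: memory attends to the window, then a sigmoid-gated blend
        #    overwrites old slots so information persists across windows.
        w, _ = self.write_attn(memory, x, x)
        g = torch.sigmoid(self.gate(w))
        memory = (1 - g) * memory + g * w
        return x, memory

# Rollout over far more steps than one attention window: memory is carried
# across windows, decoupling the effective horizon from the window size.
block = MemoryAugmentedBlock()
memory = torch.zeros(1, 8, 64)             # persistent memory bank (batch, slots, dim)
with torch.no_grad():
    for _ in range(1000):                  # 1000 windows of 32 steps each
        window = torch.randn(1, 32, 64)    # one window of observation embeddings
        out, memory = block(window, memory)
```

The design choice the sketch illustrates is that only the compact memory bank, not the full history, is passed between windows, which is what lets effective decision horizons grow far beyond the attention span at constant per-step cost.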