LLM News

Every LLM release, update, and milestone.

benchmark

RoboMME benchmark reveals memory architecture trade-offs in robotic vision-language models

Researchers introduce RoboMME, a large-scale standardized benchmark for evaluating memory in robotic vision-language-action (VLA) models across 16 manipulation tasks. The study tests 14 memory-augmented VLA variants and finds that no single memory architecture excels across all task types—each design offers distinct trade-offs depending on temporal, spatial, object, and procedural demands.

March 6, 2026 · 5:50 AM · 2 min read

benchmark robotics vision-language-action

via arxiv.org ↗

research

RealWonder generates physics-accurate videos in real time from single images

Researchers introduce RealWonder, a real-time video generation system that simulates physical consequences of 3D actions by using physics simulation as an intermediate representation. The system generates 480x832 resolution videos at 13.2 FPS from a single image, handling rigid objects, deformable bodies, fluids, and granular materials.

March 6, 2026 · 5:22 AM · 2 min read

video-generation physics-simulation action-conditioning

via arxiv.org ↗

research

ELMUR extends RL memory horizons 100,000x with structured external memory architecture

Researchers introduce ELMUR, a transformer variant that adds structured external memory to handle long-horizon reinforcement learning problems under partial observability. The system extends effective decision-making horizons beyond standard attention windows by up to 100,000x and achieves 100% success on synthetic tasks with corridors spanning one million steps.

March 5, 2026 · 5:07 AM · 2 min read

reinforcement-learning transformer-architecture memory-architecture

via arxiv.org ↗

research

REFLEX framework gives LLMs metacognitive reasoning for zero-shot robot planning

Researchers present REFLEX, a framework that equips LLM-powered robotic agents with metacognitive capabilities—skill decomposition, failure reflection, and solution synthesis—to perform complex tasks in zero-shot and few-shot settings. The system significantly outperforms existing baselines and demonstrates that LLMs can generate creative solutions that diverge from ground truth while still completing tasks successfully.

March 5, 2026 · 1:10 AM · 2 min read

robotics large-language-models metacognition

via arxiv.org ↗

funding

Fei-Fei Li's World Labs raises $1B to develop spatial intelligence AI systems

World Labs, the AI startup founded by Fei-Fei Li, has raised $1 billion in new funding to develop spatial intelligence—AI systems capable of understanding and operating in three-dimensional physical environments. The capital will fund the development of world models, a class of AI architecture designed to reason about spatial relationships and physical interactions.

February 20, 2026 · 4:38 AM · 2 min read

funding world-labs spatial-intelligence

via the-decoder.com ↗