LLM News

Every LLM release, update, and milestone.

research

Researchers propose Mixture of Universal Experts to scale MoE models via depth-width transformation

Researchers have introduced Mixture of Universal Experts (MoUE), a generalization of Mixture-of-Experts architectures that adds a new scaling dimension, virtual width: a shared expert pool is reused across layers while per-token computation stays fixed. The approach achieves up to 1.3% improvements over standard MoE baselines and 4.2% gains when converting existing MoE checkpoints.
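The core mechanism described above, one expert pool shared by every layer, with top-1 routing so per-token compute does not grow with depth, can be sketched in a toy form. Everything here (the scalar "experts", the sine-based router, the layer count) is an illustrative assumption, not the paper's implementation:

```python
import math

# Toy sketch of the MoUE idea as summarized: a single shared expert pool
# reused across all layers ("virtual width"), with top-1 routing so each
# token activates exactly one expert per layer regardless of depth.

NUM_EXPERTS = 4
NUM_LAYERS = 3  # layers share ONE pool instead of each owning its own experts

# Shared expert pool: expert i simply scales its input (stand-in for an FFN).
expert_weights = [0.5, 1.0, 1.5, 2.0]

def route(x, layer):
    # Hypothetical router: a cheap per-expert score; argmax = top-1 routing,
    # keeping per-token compute fixed no matter how many experts exist.
    scores = [math.sin(x * (i + 1) + layer) for i in range(NUM_EXPERTS)]
    return max(range(NUM_EXPERTS), key=lambda i: scores[i])

def forward(x):
    # Every layer draws from the SAME pool: depth reuses width.
    for layer in range(NUM_LAYERS):
        expert = route(x, layer)
        x = x * expert_weights[expert]
    return x

print(forward(0.7))  # 0.175
```

Note that total compute per token is `NUM_LAYERS` expert calls regardless of `NUM_EXPERTS`, which is the fixed-compute property the summary attributes to MoUE.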

research

AlignVAR improves image super-resolution with visual autoregression, 10x faster than diffusion models

Researchers propose AlignVAR, a visual autoregressive framework for image super-resolution that addresses critical consistency problems in existing VAR models. The approach combines spatial consistency autoregression and hierarchical consistency constraints to achieve 10x faster inference with 50% fewer parameters than leading diffusion-based methods.

research

New test-time training method improves LLM reasoning through self-reflection

Researchers propose TTSR, a test-time training framework where a single LLM alternates between Student and Teacher roles to improve its own reasoning. The method generates targeted variant questions based on analyzed failure patterns, showing consistent improvements across mathematical reasoning benchmarks without relying on unreliable pseudo-labels.

research

New RL framework CORE helps LLMs bridge gap between solving math problems and understanding concepts

Researchers have identified a critical gap in how large language models learn mathematics: they can solve problems but often don't understand the underlying concepts. A new reinforcement learning framework called CORE addresses this by using explicit concept definitions as training signals, rather than just reinforcing correct final answers.
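The reward design attributed to CORE, scoring concept understanding alongside answer correctness, can be illustrated with a minimal function. The keyword-overlap concept signal and the 0.5 weight are assumptions for the sketch, not the paper's actual reward:

```python
# Illustrative sketch of a reward that combines final-answer correctness
# with an explicit concept-definition signal, in the spirit of CORE.
# The scoring heuristics and weight below are assumed, not from the paper.

def concept_reward(response, gold_answer, concept_definition, w_concept=0.5):
    # Answer term: does the response contain the correct final answer?
    answer_ok = 1.0 if str(gold_answer) in response else 0.0
    # Concept term: crude proxy — fraction of the concept definition's
    # keywords that the response actually uses.
    keywords = set(concept_definition.lower().split())
    used = sum(1 for k in keywords if k in response.lower())
    concept_ok = used / len(keywords) if keywords else 0.0
    return answer_ok + w_concept * concept_ok

reward = concept_reward(
    "The slope is 3 because the derivative measures rate of change",
    3,
    "derivative measures rate of change",
)
print(reward)  # 1.5: correct answer plus full concept credit
```

A response with the right answer but no concept language would score only 1.0 here, which is the gap (answer-getting without understanding) that the concept term is meant to close.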