LLM News

Every LLM release, update, and milestone.

Filtered by: representation-learning
research

TSEmbed combines mixture-of-experts with LoRA to scale multimodal embeddings across conflicting tasks

Researchers propose TSEmbed, a multimodal embedding framework that combines Mixture-of-Experts (MoE) with Low-Rank Adaptation (LoRA) to handle task conflicts in universal embedding models. The approach introduces Expert-Aware Negative Sampling (EANS) to improve discriminative power and achieves state-of-the-art results on the Massive Multimodal Embedding Benchmark (MMEB).
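
The blurb names the general pattern but not TSEmbed's exact architecture, so the following is a minimal numpy sketch of an MoE-over-LoRA layer under assumed shapes: a frozen shared projection plus several low-rank (LoRA-style) expert updates, combined by a softmax router. All dimensions and the random router are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, n_experts = 8, 6, 2, 3

# Frozen shared projection (stands in for the pretrained weight).
W = rng.normal(size=(d_out, d_in))

# Each expert is a low-rank update: delta_W_e = B_e @ A_e (LoRA-style).
A = rng.normal(size=(n_experts, rank, d_in)) * 0.1
B = rng.normal(size=(n_experts, d_out, rank)) * 0.1

# Simple router; random here purely for illustration.
W_gate = rng.normal(size=(n_experts, d_in))

def moe_lora_forward(x):
    """Route inputs over LoRA experts and mix their low-rank updates."""
    logits = x @ W_gate.T                        # (batch, n_experts)
    gates = np.exp(logits - logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)   # softmax routing weights
    base = x @ W.T                               # frozen shared path
    # Expert e contributes x @ A_e.T @ B_e.T, weighted by its gate.
    expert_out = np.einsum('bi,eri,eor->beo', x, A, B)
    return base + np.einsum('be,beo->bo', gates, expert_out)

x = rng.normal(size=(4, d_in))
y = moe_lora_forward(x)
print(y.shape)  # (4, 6)
```

Keeping the shared weight frozen and training only the routed low-rank adapters is what lets conflicting tasks specialize without overwriting each other.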

research

Researchers map LLM reasoning as geometric flows in representation space

A new geometric framework models how large language models reason through embedding trajectories that evolve like physical flows. Researchers tested whether LLMs internalize logic beyond surface form by using identical logical propositions with varied semantic content, finding evidence that next-token prediction training leads models to encode logical invariants as higher-order geometry.

research

Meta's NLLB-200 learns universal language structure, study finds

A new study of Meta's NLLB-200 translation model reveals it has learned language-universal conceptual representations rather than merely clustering languages by surface similarity. Using 135 languages and methods from cognitive science, researchers found the model's embeddings correlate weakly but significantly with linguistic phylogenetic distances (Spearman ρ = 0.13, p = 0.020) and preserve semantic relationships across typologically diverse languages.
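
The reported ρ is a Spearman rank correlation. As a toy illustration (the study's real inputs are pairwise distances over 135 NLLB-200 language embeddings; the data below is synthetic), one can compute it by correlating ranks:

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation (no tie correction; fine for continuous data)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

# Hypothetical stand-ins: embedding distances vs. phylogenetic distances.
rng = np.random.default_rng(1)
phylo = rng.random(50)
embed = 0.3 * phylo + rng.random(50)   # weak positive relation, like rho ≈ 0.13

rho = spearman_rho(embed, phylo)
print(round(rho, 3))
```

A small but significant ρ like 0.13 means rank agreement well above chance, not a strong linear fit.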

2 min read · via arxiv.org
research

SiNGER framework improves vision transformer distillation by suppressing high-norm artifacts

Researchers introduce SiNGER (Singular Nullspace-Guided Energy Reallocation), a knowledge distillation framework that improves how Vision Transformer features transfer to smaller student models. The method suppresses high-norm artifacts that degrade representation quality while preserving informative signals from teacher models.
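
The summary does not spell out SiNGER's singular-nullspace procedure, so here is only a generic sketch of the problem it targets: a few ViT patch tokens carry abnormally high norms, and suppressing them (here by rescaling outlier tokens toward the typical norm while keeping their direction) cleans the feature map a student distills from. The threshold and rescaling rule are assumptions, not the paper's method.

```python
import numpy as np

def suppress_high_norm_tokens(feats, norm_factor=3.0):
    """Shrink outlier (high-norm) token features toward the typical norm.

    Generic artifact suppression for illustration; SiNGER's actual
    nullspace-guided energy reallocation is more involved.
    """
    norms = np.linalg.norm(feats, axis=-1)
    typical = np.median(norms)
    outlier = norms > norm_factor * typical
    cleaned = feats.copy()
    # Keep each outlier token's direction but reduce its magnitude.
    cleaned[outlier] *= (typical / norms[outlier])[:, None]
    return cleaned

rng = np.random.default_rng(2)
feats = rng.normal(size=(196, 64))   # e.g. 14x14 ViT patch tokens
feats[[3, 77]] *= 50.0               # inject high-norm artifact tokens
cleaned = suppress_high_norm_tokens(feats)
print(np.linalg.norm(cleaned, axis=-1).max() < np.linalg.norm(feats, axis=-1).max())
```

Without such suppression, the distillation loss is dominated by the few artifact tokens rather than the informative ones.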