LLM News

Every LLM release, update, and milestone.

research

New test-time training method improves LLM reasoning through self-reflection

Researchers propose TTSR, a test-time training framework where a single LLM alternates between Student and Teacher roles to improve its own reasoning. The method generates targeted variant questions based on analyzed failure patterns, showing consistent improvements across mathematical reasoning benchmarks without relying on unreliable pseudo-labels.
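The alternating-role loop described above can be sketched as follows. This is a toy illustration with stub functions, not the paper's implementation; `student_answer`, `teacher_analyze`, and the dictionary "model" are hypothetical stand-ins for real LLM calls.

```python
# Minimal sketch of a TTSR-style test-time training round (hypothetical
# helper names; a real implementation would prompt an actual LLM).

def student_answer(model, question):
    # Student role: attempt the question with the current model state.
    return model["skills"].get(question)

def teacher_analyze(failures):
    # Teacher role: inspect failure patterns and emit targeted variant
    # questions for the Student to practice on.
    return [f"variant of: {q}" for q in failures]

def ttsr_round(model, questions):
    failures = [q for q in questions if student_answer(model, q) is None]
    variants = teacher_analyze(failures)
    # Test-time "training": practice on the generated variants rather
    # than relying on pseudo-labels for the original questions.
    for v in variants:
        model["skills"][v] = "practiced"
    return model, variants

model = {"skills": {"2+2": "4"}}
model, variants = ttsr_round(model, ["2+2", "17*23"])
print(variants)  # the Teacher targets only the failed question
```

The key point the sketch captures is that variant generation is driven by observed failures, so the self-improvement signal stays grounded in the model's own mistakes.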

research

OSCAR: New RAG compression method achieves 2-5x speedup with minimal accuracy loss

Researchers have introduced OSCAR, a query-dependent compression method for Retrieval-Augmented Generation that speeds up inference 2-5x while preserving accuracy. Unlike traditional approaches, OSCAR compresses retrieved information dynamically at inference time rather than offline, eliminating storage overhead and enabling higher compression rates.
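The query-dependent angle can be illustrated with a toy compressor. This uses naive token-overlap scoring purely for illustration; OSCAR's actual compressor is a learned model, and `compress_for_query` is a hypothetical name.

```python
# Toy sketch of query-dependent context compression in the spirit of
# OSCAR: retrieved passages are pruned per query at inference time,
# so nothing query-specific is precomputed or stored offline.

def compress_for_query(query, passages, keep=2):
    q_tokens = set(query.lower().split())
    scored = sorted(
        passages,
        key=lambda p: len(q_tokens & set(p.lower().split())),
        reverse=True,
    )
    # Keep only the passages most relevant to this particular query.
    return scored[:keep]

docs = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "Bananas are rich in potassium.",
]
print(compress_for_query("What is the capital of France?", docs, keep=1))
```

Because compression happens after the query arrives, the same corpus can be compressed differently for every question, which is what enables the higher compression rates without an offline storage cost.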

research

LaDiR uses latent diffusion to improve LLM reasoning beyond autoregressive limits

Researchers propose LaDiR, a framework that replaces traditional autoregressive decoding with latent diffusion models to improve LLM reasoning. The approach encodes reasoning steps into compressed latent representations and uses bidirectional attention to refine solutions iteratively, enabling parallel exploration of diverse reasoning paths.

via arxiv.org
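The contrast with autoregressive decoding can be shown with a toy refinement loop. This is not LaDiR's actual model; it only illustrates the idea of refining a whole block of latent reasoning vectors jointly over several denoising-style steps, with every position attending to every other.

```python
# Toy illustration of iterative, non-autoregressive latent refinement:
# all positions in the latent "thought" block update in parallel,
# rather than being emitted left to right one token at a time.

def refine_step(latents, step_size=0.5):
    # Bidirectional update: each position is pulled toward a summary
    # of the whole block, so information flows in both directions.
    mean = sum(latents) / len(latents)
    return [x + step_size * (mean - x) for x in latents]

latents = [4.0, 0.0, 2.0]   # noisy initial reasoning latents
for _ in range(10):
    latents = refine_step(latents)
print(latents)  # positions converge jointly, not token by token
```

A real diffusion model would denoise learned high-dimensional latents with a neural network; the toy version keeps only the structural point that refinement is iterative and parallel across positions.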
research

VeriStruct automates formal verification of Rust data structures with 99.2% function success rate

Researchers have introduced VeriStruct, a framework that extends AI-assisted formal verification from individual functions to complete data structure modules in Verus. The system successfully verified 128 of 129 functions (99.2%) across eleven Rust data structure modules by using a planner module to generate abstractions, type invariants, and proof code, with automatic error correction for Verus syntax.

via arxiv.org
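The plan-verify-repair pipeline described above can be sketched schematically. The actual system generates Verus abstractions, invariants, and proof code and invokes the verifier; here `plan` and `verify` are illustrative stubs (the stub verifier simply succeeds after one repair round).

```python
# Schematic of a VeriStruct-style loop: a planner proposes verification
# artifacts, the verifier is run, and failures trigger automatic
# error-correction retries up to a repair budget.

def plan(module):
    # Planner: propose an abstraction and type invariant for the module
    # (illustrative placeholder, not real Verus output).
    return {"invariant": f"len({module}) >= 0"}

def verify(module, artifacts, attempt):
    # Stand-in for running the Verus verifier; in this stub, the first
    # automatic repair fixes the proof.
    return attempt >= 1

def verify_module(module, max_repairs=3):
    artifacts = plan(module)
    for attempt in range(max_repairs + 1):
        if verify(module, artifacts, attempt):
            return True, attempt
        # Automatic error correction: repair proof/syntax and retry.
    return False, max_repairs

print(verify_module("ring_buffer"))  # (True, 1): verified after one repair
```

The repair loop is what lets the system recover from Verus syntax errors without human intervention, which is central to reaching the reported 128/129 function success rate.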