LLM News

Every LLM release, update, and milestone.

research

Timer-S1: 8.3B time series foundation model achieves state-of-the-art forecasting on GIFT-Eval

Researchers have introduced Timer-S1, a Mixture-of-Experts time series foundation model with 8.3 billion total parameters and 750 million activated parameters per token. The model achieves state-of-the-art forecasting performance on the GIFT-Eval leaderboard, with the best MASE and CRPS scores among pre-trained models.

March 6, 2026 · 6:09 AM · 2 min read

Timer-S1: 8.3B time series foundation model achieves state-of-the-art forecasting on GIFT-Eval

New technique extends LLM context windows to 128K tokens without expensive retraining

1.58-bit BitNet models naturally support structured sparsity with minimal accuracy loss

Progressive Residual Warmup improves LLM pretraining stability and convergence speed

Study shows LLMs can fact-check using internal knowledge without external retrieval

Researchers Identify 'Contextual Inertia' Bug in LLMs During Multi-Turn Conversations

Researchers propose Mixture of Universal Experts to scale MoE models via depth-width transformation

New framework improves VLM spatial reasoning through minimal information selection

ButterflyMoE achieves 150× memory reduction for mixture-of-experts models via geometric rotations

Research: Contrastive refinement reduces AI model over-refusal without sacrificing safety

New Method Reduces AI Over-Refusal Without Sacrificing Safety Alignment

Researchers use LLMs to simulate misinformation susceptibility across demographics with 92% accuracy

Spectral Surgery: Training-Free Method Improves LoRA Adapters Without Retraining

Study reveals preference leakage bias when LLMs judge synthetically-trained models

Researchers identify and fix critical toggle control failure in multimodal GUI agents

New RLVR method reformulates reward-based LLM training as classification problem

Diffusion language models memorize less training data than autoregressive models, study finds

CoDAR framework shows continuous diffusion language models can match discrete approaches

New benchmark reveals LLMs lose controllability at finer behavioral levels

VC-STaR: Researchers use visual contrast to reduce hallucinations in VLM reasoning

Researchers propose DiSE, a self-evaluation method for diffusion language models

WAFFLE fine-tuning improves multimodal models for web development by 9 percentage points

DynFormer rethinks Transformers for physics simulations, cutting PDE solver errors by 95%

New safety steering technique reduces unsafe T2I outputs without degrading image quality

AI agent outperforms 9 of 10 human hackers in live penetration testing study

DiaBlo: Diagonal Block Finetuning Matches Full Model Performance With Lower Cost

Alignment tuning shrinks LLM output diversity by 2-5x, new research shows

SiNGER framework improves vision transformer distillation by suppressing high-norm artifacts

MedXIAOHE: New medical vision-language model claims state-of-the-art performance on clinical benchmarks

DeepXiv-SDK releases three-layer agentic interface for scientific literature access