LLM News

Every LLM release, update, and milestone.

research

New technique extends LLM context windows to 128K tokens without expensive retraining

Researchers propose SharedLLM, a framework that extends language model context windows from 8K to 128K tokens without costly continual pre-training. The method stacks two short-context models (one as a compressor, one as a decoder) with specialized tree-based information retrieval, achieving 2-3x inference speedups while maintaining competitive performance.

research

Neural Paging system reduces LLM context management complexity from O(N²) to O(N·K²)

A new research paper introduces Neural Paging, a hierarchical architecture that optimizes how LLMs manage their limited context windows by learning semantic caching policies. The approach reduces asymptotic complexity for long-horizon reasoning from O(N²) to O(N·K²) under bounded context window size K, addressing a fundamental bottleneck in deploying universal agents with external memory.