RePo: Research Shows Dynamic Positional Encoding Improves LLM Context Understanding

A new research paper introduces RePo, a mechanism that replaces fixed positional encoding with learned, context-aware token positioning. Tested on OLMo-2 1B and 7B models, RePo shows consistent improvements on tasks with noisy contexts and longer sequences while maintaining performance on standard benchmarks.

RePo: Language Models with Context Re-Positioning

Researchers have proposed RePo, a novel mechanism that replaces the rigid positional encoding schemes used in current language models with learned, context-dependent token positioning.

The Problem with Fixed Positional Encoding

Modern large language models assign tokens fixed positional indices, either linearly (position 1, 2, 3, ...) or through other predetermined schemes. According to the research, this approach is informationally impoverished and increases what the authors call "extraneous cognitive load," a concept borrowed from Cognitive Load Theory. In other words, the model wastes processing capacity on uninformative positional structure instead of allocating it to deeper reasoning and attention patterns.
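To make the "fixed indices" point concrete, here is a minimal sketch of the classic sinusoidal scheme: every token's position depends only on its index in the sequence, never on its content. (This is the standard Transformer encoding, not anything specific to the paper.)

```python
import numpy as np

def sinusoidal_positions(seq_len: int, d_model: int) -> np.ndarray:
    """Fixed sinusoidal encoding: token i always gets position i,
    regardless of what the surrounding context contains."""
    positions = np.arange(seq_len)[:, None]        # rigid ids 0, 1, 2, ...
    dims = np.arange(d_model)[None, :]
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])         # even dims: sine
    enc[:, 1::2] = np.cos(angles[:, 1::2])         # odd dims: cosine
    return enc

enc = sinusoidal_positions(8, 16)
print(enc.shape)  # (8, 16)
```

The key limitation the paper targets: this mapping is identical for every input, so the positional signal carries no information about the context's actual structure.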

How RePo Works

RePo introduces a differentiable module, f_φ, that learns to assign token positions dynamically based on contextual dependencies. Rather than relying on predefined sequential order, positions are assigned in a learned, non-linear space that captures the intrinsic structure of the input context.

The mechanism was evaluated through continual pre-training experiments on two OLMo-2 models: the 1B and 7B parameter versions.
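The paper does not spell out f_φ's exact parameterization here, but the idea can be sketched as a small learned network that maps each token's contextual hidden state to a continuous position. Everything below (the two-layer MLP shape, the scalar-position output) is a hypothetical illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def repo_positions(hidden: np.ndarray, w1, b1, w2, b2) -> np.ndarray:
    """Hypothetical sketch of a differentiable re-positioning module f_phi:
    a two-layer MLP maps each token's hidden state to one continuous,
    context-dependent position. Because the input is the contextual
    representation, the same token can land at different positions in
    different contexts -- unlike a fixed index."""
    h = np.tanh(hidden @ w1 + b1)       # per-token nonlinearity
    pos = (h @ w2 + b2).squeeze(-1)     # one scalar position per token
    return pos

seq_len, d, d_hid = 6, 8, 16
hidden = rng.normal(size=(seq_len, d))  # stand-in for contextual states
w1 = rng.normal(size=(d, d_hid)); b1 = np.zeros(d_hid)
w2 = rng.normal(size=(d_hid, 1)); b2 = np.zeros(1)

pos = repo_positions(hidden, w1, b1, w2, b2)
print(pos.shape)  # (6,)
```

In a full model, these learned positions would feed whatever rotary or additive encoding the architecture uses, and f_φ would be trained end-to-end with the language-modeling objective.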

Benchmark Results

According to the researchers:

  • Noisy contexts: RePo consistently enhanced performance on tasks involving noisy or irrelevant information
  • Structured data: Improvements observed on tasks requiring understanding of structured input
  • Longer contexts: Better performance on tasks requiring processing of extended sequences
  • General short-context tasks: Maintained competitive performance alongside existing methods

Detailed analysis reveals that RePo successfully:

  • Allocates higher attention weights to distant but semantically relevant information
  • Assigns positions in dense, non-linear space rather than sparse linear arrangements
  • Captures the intrinsic relational structure of input contexts

Availability

The researchers state they will open-source both code and model weights. Code is currently available at github.com/SakanaAI/repo.

What This Means

Repositioning mechanisms like RePo represent a shift in how researchers think about fundamental LLM architecture. Rather than accepting positional encoding as a solved problem, this work questions whether the standard approaches actually serve model cognition efficiently. If the learned approach generalizes beyond OLMo-2, it could influence how future models handle context—particularly for retrieval-augmented generation, long-context applications, and noisy real-world inputs. The emphasis on cognitive load theory also suggests a more principled approach to architectural design beyond empirical scaling.
