Researchers extend Vision Mamba sequence length 4x with separator-based pretraining
Researchers have introduced STAR (Separators for AutoRegressive pretraining), a method that extends Vision Mamba's input sequence length 4x by inserting separator tokens between images concatenated into a single sequence during autoregressive pretraining. The STAR-B model achieved 83.5% accuracy on ImageNet-1k, indicating that the longer sequences improve long-range dependency modeling in vision tasks.
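The mechanics of separator insertion are simple enough to sketch. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: it concatenates patch-token sequences from several images into one long sequence, placing a learnable separator token between consecutive images so the model sees explicit image boundaries during autoregressive pretraining. The class name `SeparatorConcat`, the learnable separator, and all shapes are hypothetical.

```python
import torch
import torch.nn as nn


class SeparatorConcat(nn.Module):
    """Concatenate per-image patch tokens with separator tokens in between.

    A minimal sketch of separator-based sequence extension; the learnable
    separator and all names/shapes are assumptions, not the paper's code.
    """

    def __init__(self, embed_dim: int):
        super().__init__()
        # One separator token embedding, broadcast across the batch.
        self.separator = nn.Parameter(torch.zeros(1, 1, embed_dim))
        nn.init.trunc_normal_(self.separator, std=0.02)

    def forward(self, image_tokens: list[torch.Tensor]) -> torch.Tensor:
        # image_tokens: list of (batch, num_patches, embed_dim) tensors,
        # one entry per image in the extended sequence.
        batch = image_tokens[0].shape[0]
        sep = self.separator.expand(batch, -1, -1)
        pieces = []
        for i, tokens in enumerate(image_tokens):
            if i > 0:
                pieces.append(sep)  # mark the boundary between images
            pieces.append(tokens)
        return torch.cat(pieces, dim=1)


# Example: four 196-token images -> one 787-token sequence (196 * 4 + 3
# separators), i.e. roughly 4x the single-image sequence length.
concat = SeparatorConcat(embed_dim=192)
imgs = [torch.randn(2, 196, 192) for _ in range(4)]
long_seq = concat(imgs)
print(long_seq.shape)  # torch.Size([2, 787, 192])
```

In a setup like this, the extended sequence would feed the Mamba backbone's next-token prediction objective, with the separators giving the state-space model an explicit cue for where one image ends and the next begins.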