LLM News

Every LLM release, update, and milestone.

Filtered by: state-space-models
research

ms-Mamba outperforms Transformer models on time-series forecasting with fewer parameters

Researchers introduced ms-Mamba, a multi-scale Mamba architecture for time-series forecasting that outperforms recent Transformer- and Mamba-based models while using significantly fewer parameters. On the Solar-Energy dataset, ms-Mamba achieved a mean-squared error of 0.229 versus 0.240 for S-Mamba, while using only 3.53M parameters compared to S-Mamba's 4.77M.
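The core idea of multi-scale processing can be illustrated with a minimal sketch: run the same series at several temporal resolutions and fuse the per-scale outputs. The scale factors and the moving-average "model" below are illustrative placeholders standing in for ms-Mamba's actual blocks, which the summary above does not specify.

```python
import numpy as np

def moving_average(x, k):
    """Smooth a 1-D series with a window of size k (a stand-in for a
    real per-scale sequence model such as a Mamba block)."""
    kernel = np.ones(k) / k
    return np.convolve(x, kernel, mode="same")

def multi_scale_features(x, scales=(1, 2, 4)):
    """Conceptual multi-scale pass: downsample the series at each scale,
    process it, upsample back to full length, and fuse by averaging.
    All components here are hypothetical illustrations."""
    outputs = []
    for s in scales:
        coarse = x[::s]                        # view at temporal scale s
        processed = moving_average(coarse, 3)  # per-scale "model"
        upsampled = np.repeat(processed, s)[: len(x)]
        outputs.append(upsampled)
    return np.mean(outputs, axis=0)            # fuse the scales

series = np.sin(np.linspace(0, 8 * np.pi, 96))
features = multi_scale_features(series)
print(features.shape)  # (96,)
```

The intuition is that coarse scales capture slow trends while fine scales capture local fluctuations, so the fused representation sees both with little added parameter cost.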

research

Researchers extend Vision Mamba sequence length 4x with separator-based pretraining

Researchers have introduced STAR (Separators for AutoRegressive pretraining), a method that extends Vision Mamba's input sequence length by 4x through strategic separator insertion between images. The STAR-B model achieved 83.5% accuracy on ImageNet-1k, demonstrating improved long-range dependency modeling in vision tasks.
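The separator idea can be sketched as follows: concatenate the patch-token sequences of several images into one long training sequence, inserting a separator token at each image boundary. The token id and packing function below are hypothetical illustrations of the concept, not STAR's implementation.

```python
import numpy as np

SEP_ID = -1  # hypothetical separator id; in practice this would be a learned token

def pack_images_with_separators(image_token_seqs):
    """Pack per-image token sequences into one long sequence, inserting a
    separator between consecutive images so the model can tell where one
    image ends and the next begins."""
    packed = []
    for i, tokens in enumerate(image_token_seqs):
        if i > 0:
            packed.append(SEP_ID)  # mark the image boundary
        packed.extend(tokens)
    return np.array(packed)

# Four images of 196 patch tokens each -> one roughly 4x-longer sequence
images = [list(range(196)) for _ in range(4)]
seq = pack_images_with_separators(images)
print(len(seq))  # 196 * 4 + 3 separators = 787
```

Packing multiple images per sequence is what lets the autoregressive model train on 4x-longer contexts, and the separators keep cross-image boundaries explicit.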