LLM News

Every LLM release, update, and milestone.

Filtered by: qwen
research

Reinforcement fine-tuning preserves model knowledge better than supervised fine-tuning, study finds

A new study on Qwen2.5-VL reveals reinforcement fine-tuning (RFT) significantly outperforms supervised fine-tuning (SFT) at preserving a model's existing knowledge during post-training adaptation. While SFT enables faster task learning, it causes catastrophic forgetting; RFT learns more slowly but maintains prior knowledge by reinforcing samples naturally aligned with the base model's probability landscape.
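A minimal sketch of the two objectives being contrasted (illustrative placeholders, not the study's code): SFT minimizes cross-entropy against fixed labels, while RFT reweights the model's own samples by a task reward, which keeps updates close to what the base model already assigns probability to.

```python
import torch.nn.functional as F

def sft_loss(logits, target_ids):
    # Supervised fine-tuning: push the model toward externally provided labels,
    # regardless of how likely those tokens were under the base model.
    return F.cross_entropy(logits.view(-1, logits.size(-1)), target_ids.view(-1))

def rft_loss(sampled_logprobs, rewards):
    # Reinforcement fine-tuning (REINFORCE-style sketch): reweight the model's
    # own samples by a task reward, so learning concentrates on outputs the
    # base model's probability landscape already supports.
    return -(rewards * sampled_logprobs).mean()
```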

research

Researchers identify 'Lazy Attention' problem in multimodal AI training, boost reasoning by 7%

A new arXiv paper identifies a critical flaw in how multimodal large reasoning models are initialized for training: they fail to properly attend to visual tokens, a phenomenon the researchers call Lazy Attention Localization. The team proposes AVAR, a framework that corrects this through visual-anchored data synthesis and attention-guided objectives, achieving an average 7% improvement across seven multimodal reasoning benchmarks when applied to Qwen2.5-VL-7B.
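The symptom itself is easy to express as code: the share of attention mass that queries place on image tokens. The sketch below is illustrative only; the function names, mask convention, and penalty form are assumptions, not AVAR's implementation.

```python
import torch

def visual_attention_mass(attn, visual_mask):
    # attn: (batch, heads, query_len, key_len) attention probabilities
    # visual_mask: (batch, key_len) bool, True where the key position is an image token
    # Returns the average share of attention placed on visual tokens;
    # a low value is the symptom described as Lazy Attention Localization.
    mass_on_visual = (attn * visual_mask[:, None, None, :].float()).sum(dim=-1)
    return mass_on_visual.mean()

def attention_guidance_penalty(attn, visual_mask, floor=0.3):
    # Hypothetical attention-guided auxiliary term: penalize the visual attention
    # share falling below a target floor (illustrative, not the paper's loss).
    return torch.relu(floor - visual_attention_mass(attn, visual_mask))
```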

research

Spectral Surgery: training-free method improves LoRA adapters without retraining

Researchers propose Spectral Surgery, a training-free refinement method that improves Low-Rank Adaptation (LoRA) adapters by decomposing trained weights via SVD and selectively reweighting singular values based on gradient-estimated component sensitivity. The approach achieves consistent gains across Llama-3.1-8B and Qwen3-8B—up to +4.4 points on CommonsenseQA and +2.4 pass@1 on HumanEval—by adjusting only ~1,000 scalar coefficients.
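The core recipe lends itself to a compact sketch. The reweighting rule and the sensitivity input below are placeholders, since the paper's exact formulas are not given in this summary.

```python
import torch

def spectral_surgery(lora_A, lora_B, sensitivity, scale=0.1):
    # lora_A: (r, in_features), lora_B: (out_features, r), sensitivity: (r,)
    # Merge the adapter, decompose it with SVD, then rescale only the singular
    # values according to each component's estimated importance.
    r = lora_A.shape[0]
    delta_w = lora_B @ lora_A                                  # rank <= r weight update
    U, S, Vh = torch.linalg.svd(delta_w, full_matrices=False)
    U, S, Vh = U[:, :r], S[:r], Vh[:r]                         # only the top-r components are non-trivial
    gains = 1.0 + scale * torch.tanh(sensitivity)              # placeholder reweighting rule
    return U @ torch.diag(S * gains) @ Vh                      # refined weight update
```

Only the r singular values per adapted matrix are touched, which is how the total number of adjusted scalars stays on the order of a thousand across a full model.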

research

Research reveals LLMs internalize logic as geometric flows in representation space

A new geometric framework demonstrates that LLMs internalize logical reasoning as smooth flows—embedding trajectories—in their representation space, rather than merely pattern-matching. The research, which tests logic across different semantic contexts, suggests next-token prediction training alone can produce higher-order geometric structures that encode logical invariants.
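One way to picture an "embedding trajectory" is to track a token's hidden state layer by layer. The probe below is an illustrative sketch of that idea under that assumption, not the paper's framework.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def embedding_trajectory(model_name, prompt):
    # Track the final token's hidden state through the layers and measure how
    # smoothly it moves (high consecutive-step cosine similarity ~ smoother flow).
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states
    traj = torch.stack([h[0, -1] for h in hidden])   # (num_layers + 1, hidden_dim)
    steps = traj[1:] - traj[:-1]
    smoothness = torch.nn.functional.cosine_similarity(steps[1:], steps[:-1], dim=-1)
    return traj, smoothness
```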

model release

Alibaba releases Qwen3.5-35B-A3B-FP8, a quantized multimodal model for efficient deployment

Alibaba's Qwen team released Qwen3.5-35B-A3B-FP8 on Hugging Face, a quantized version of their 35-billion-parameter multimodal model. The FP8 quantization reduces model size and memory requirements while maintaining the base model's image-text-to-text capabilities. The model is compatible with standard Transformers endpoints and Azure deployment.
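A minimal loading sketch with the Transformers library, assuming the repo id matches the article's naming and the checkpoint exposes the standard image-text-to-text interface (both are assumptions; check the model card for the exact usage):

```python
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "Qwen/Qwen3.5-35B-A3B-FP8"  # repo id inferred from the article; verify on Hugging Face

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

# Chat-template content keys ("image"/"url") vary between processor versions; this
# follows the common recent-Transformers convention for vision-language models.
messages = [{"role": "user", "content": [
    {"type": "image", "url": "https://example.com/photo.jpg"},
    {"type": "text", "text": "Describe this image."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0], skip_special_tokens=True))
```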

1 min read · via huggingface.co