LLM News

Every LLM release, update, and milestone.

research

Reinforcement fine-tuning preserves model knowledge better than supervised fine-tuning, study finds

A new study on Qwen2.5-VL reveals that reinforcement fine-tuning (RFT) significantly outperforms supervised fine-tuning (SFT) at preserving a model's existing knowledge during post-training adaptation. While SFT enables faster task learning, it causes catastrophic forgetting; RFT learns more slowly but retains prior knowledge because it reinforces samples that are already likely under the base model's probability distribution.
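The intuition can be sketched with a toy example (not the paper's method): on a three-class "policy", an SFT cross-entropy gradient pushes hard toward a fixed target label regardless of the base distribution, while a REINFORCE-style RFT gradient only reinforces the model's own samples, so behaviours the base model rarely produces are rarely updated. All names and numbers below are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0])  # base model strongly favours class 0
p = softmax(logits)

# --- SFT-style gradient: cross-entropy toward a fixed target (class 2) ---
target = np.array([0.0, 0.0, 1.0])
sft_grad = p - target  # d(CE)/d(logits): large push toward an unlikely class

# --- RFT-style (REINFORCE) gradient: sample from the model, reward class 2 ---
samples = rng.choice(3, size=10_000, p=p)
rft_grad = np.zeros(3)
for s in samples:
    reward = 1.0 if s == 2 else 0.0
    onehot = np.eye(3)[s]
    rft_grad += reward * (p - onehot)  # grad of -log p(s), scaled by reward
rft_grad /= len(samples)

# Class 2 is rarely sampled under the base model, so the RFT gradient is
# much smaller: learning is slower, but the policy stays near the base model.
print(np.abs(sft_grad).sum() > np.abs(rft_grad).sum())  # True
```

The small RFT gradient magnitude mirrors the study's finding: slower task learning, but less drift away from what the base model already knows.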

research

WAFFLE fine-tuning improves multimodal models for web development by 9 percentage points

Researchers introduce WAFFLE, a fine-tuning methodology that enhances multimodal models' ability to convert UI designs into HTML code. The approach uses structure-aware attention mechanisms and contrastive learning to bridge the gap between visual UI designs and text-based HTML, achieving improvements of up to 9 percentage points on benchmark tasks.
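The contrastive component can be illustrated with a generic InfoNCE-style loss (a common choice for image-text alignment; this is a hedged sketch, not WAFFLE's actual implementation): matched (UI screenshot, HTML) embedding pairs sit on the diagonal of a similarity matrix and are pulled together, while mismatched pairs are pushed apart. The encoders and embeddings below are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2norm(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

batch, dim = 4, 8
# Placeholder encoder outputs: HTML embeddings are noisy copies of the
# matching UI-screenshot embeddings, standing in for real encoders.
img_emb = l2norm(rng.normal(size=(batch, dim)))
html_emb = l2norm(img_emb + 0.1 * rng.normal(size=(batch, dim)))

temperature = 0.1
logits = img_emb @ html_emb.T / temperature  # pairwise cosine similarities

def cross_entropy_diag(lg):
    # Softmax cross-entropy where the positive pair is the diagonal entry.
    lg = lg - lg.max(axis=1, keepdims=True)
    logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(logp))

# Symmetric loss: image-to-HTML retrieval plus HTML-to-image retrieval.
loss = 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))
print(float(loss))
```

Minimizing this loss makes each UI screenshot most similar to its own HTML among the in-batch alternatives, which is the alignment the summary describes.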

benchmark

HSSBench: New benchmark reveals MLLMs struggle with humanities and social sciences reasoning

Researchers have released HSSBench, a new benchmark designed to evaluate multimodal large language models on humanities and social sciences tasks, areas where current benchmarks are sparse. The benchmark contains over 13,000 samples across six key categories in multiple languages, and testing shows that even state-of-the-art models struggle significantly with the cross-disciplinary reasoning these domains require.