LLM News | TPS

research

Researchers propose WIM rating system to replace subjective numerical scores in LLM training

A new research paper introduces the What Is Missing (WIM) rating system, which ranks model outputs using natural-language feedback instead of subjective numerical scores. According to the authors, the approach integrates into existing LLM training pipelines and produces fewer ties and a clearer training signal than discrete ratings.
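To illustrate why ties matter for preference training, here is a minimal sketch, not the paper's method. It assumes each output for a prompt already has a score and builds (chosen, rejected) pairs of the kind DPO-style training consumes, skipping ties. The function name `preference_pairs` and all score values are hypothetical; how WIM actually derives its rankings from feedback is not described in this summary.

```python
from itertools import combinations

def preference_pairs(scores):
    """Return (chosen, rejected) index pairs; tied outputs yield no pair."""
    pairs = []
    for i, j in combinations(range(len(scores)), 2):
        if scores[i] > scores[j]:
            pairs.append((i, j))
        elif scores[j] > scores[i]:
            pairs.append((j, i))
        # Equal scores carry no preference signal, so the pair is dropped.
    return pairs

# Hypothetical scores for four outputs of a single prompt.
discrete = [4, 4, 3, 4]                  # 1-5 ratings: three outputs tie
continuous = [0.81, 0.77, 0.42, 0.69]    # placeholder finer-grained scores

print(len(preference_pairs(discrete)))    # 3
print(len(preference_pairs(continuous)))  # 6
```

With the coarse ratings, half of the six possible pairs are lost to ties; with distinct scores, every pair is usable.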

March 6, 2026 · 5:53 AM · 2 min read

llm-training preference-learning dpo

via arxiv.org ↗