Best Cheap LLM in 2026

Active AI models ranked by input price per million tokens. Prices from official provider pages.

Updated automatically as pricing changes. Full model database →

Finding the best value LLM

Price alone doesn't tell the whole story. A model that costs twice as much per token but solves the task in half the calls is cheaper overall. When evaluating cost, consider:

  • Input vs output pricing — for chat and RAG, input is usually 80%+ of your tokens. For generation-heavy tasks (writing, summarization), output price matters more.
  • Context window — larger contexts let you process more in a single call, reducing round trips and total token usage.
  • Open-weight models — if you can self-host, models like Mistral and LLaMA can cost near zero at scale. Check the “open weights” column on the model database.
  • Quality vs cost — compare benchmark scores on the benchmark leaderboard to find the best performance per dollar.
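The first point above is easy to check with arithmetic: effective cost is price per token times tokens per task, not list price alone. A minimal sketch, using illustrative prices and token counts (not real quotes from any provider):

```python
# Sketch: compare effective cost per task, not just list price.
# All prices ($/1M tokens) and token counts below are illustrative assumptions.

def cost_per_request(input_price, output_price, input_tokens, output_tokens):
    """Dollar cost of one API call, given per-million-token prices."""
    return (input_price * input_tokens + output_price * output_tokens) / 1_000_000

# Hypothetical model A: cheap per token, but needs 2 calls to finish a task.
# Hypothetical model B: twice the per-token price, solves it in 1 call.
a = 2 * cost_per_request(0.10, 0.40, input_tokens=8_000, output_tokens=1_000)
b = 1 * cost_per_request(0.20, 0.80, input_tokens=8_000, output_tokens=1_000)
print(f"model A per task: ${a:.4f}")
print(f"model B per task: ${b:.4f}")
```

With these made-up numbers the two come out identical per task, so the "expensive" model is break-even; if it also needed fewer tokens per call, it would be the cheaper choice.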

Also see: Best Coding LLM, Best Reasoning LLM, Compare any two models.