vllm

1 article tagged with vllm

June 18, 2026

Mistral AI traces 400 MB/minute memory leak in vLLM to kernel-level mmap calls outside the heap

Mistral AI's engineering team documented its investigation of a memory leak in vLLM that caused memory to grow by 400 MB per minute during disaggregated serving with Mistral Medium 3.1. The leak appeared only under specific conditions, including graph compilation and NIXL-based KV cache transfer. The team eventually traced it to mmap allocations made outside the traditional heap, which standard profiling tools could not detect.

June 18, 2026 · 8:54 AM
