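vLLM is an inference and serving engine designed to run large language models efficiently. It manages GPU memory for the attention key-value cache in small blocks (a technique called PagedAttention) and batches incoming requests together, which makes it a common choice for serving large language models at scale.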
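As a rough illustration of what block-based memory management can look like, here is a toy Python sketch. It is not vLLM's code: the `BlockAllocator` class, its methods, and the block sizes are hypothetical names and values chosen for this example. It only shows the general idea of handing out fixed-size blocks on demand instead of reserving one large region per request.

```python
class BlockAllocator:
    """Toy allocator: memory is split into fixed-size blocks handed out on demand."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size              # tokens that fit in one block
        self.free_blocks = list(range(num_blocks))
        self.tables = {}                          # request id -> list of block ids
        self.lengths = {}                         # request id -> tokens stored so far

    def append_token(self, request_id: str) -> bool:
        """Reserve room for one more token; return False if no block is free."""
        length = self.lengths.get(request_id, 0)
        table = self.tables.setdefault(request_id, [])
        # A new block is needed only when every block owned so far is full.
        if length == len(table) * self.block_size:
            if not self.free_blocks:
                return False
            table.append(self.free_blocks.pop())
        self.lengths[request_id] = length + 1
        return True

    def release(self, request_id: str) -> None:
        """Return all blocks of a finished request to the free pool."""
        self.free_blocks.extend(self.tables.pop(request_id, []))
        self.lengths.pop(request_id, None)


# Two requests of different lengths share a pool of 4 blocks of 16 tokens each.
allocator = BlockAllocator(num_blocks=4, block_size=16)
for _ in range(20):
    allocator.append_token("req-a")   # 20 tokens need 2 blocks
for _ in range(5):
    allocator.append_token("req-b")   # 5 tokens need 1 block
blocks_in_use = 4 - len(allocator.free_blocks)
```

Because each request only holds the blocks it has actually filled, a short request does not tie up memory sized for the longest possible one, and blocks freed by `release` can be reused by the next request right away.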

First released: 2023
Developed by: UC Berkeley
Open-source: Yes

Interesting facts

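vLLM is known for improving throughput and reducing the cost of serving large language models, largely by wasting less memory on the attention key-value cache so that more requests can be processed in the same batch.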
