vLLM is an open-source software framework for inference and serving of large language models and related multimodal models. Originally developed at the University of California, Berkeley's Sky Computing Lab, the project is centered on PagedAttention, a memory-management method for transformer key–value caches, and supports features such as continuous batching, distributed inference, quantization, and OpenAI-compatible APIs. According to a project maintainer, the "v" in vLLM originally referred to "virtual", inspired by virtual memory.
4 jobs · 2 companies · last seen 7 April 2026
There are currently 4 open roles in Berlin that mention vLLM, posted by 2 companies. vLLM is most often listed alongside LLM, Fine-tuning, and Python. It appears most in the AI/ML job family (2% of that family's open roles) and in Leadership (1%).
Frequently asked questions
- How many vLLM jobs are available in Berlin right now?
- There are currently 4 open roles in Berlin that mention vLLM, posted by 2 companies. The board is refreshed several times a day, more often during Berlin working hours, so new roles can appear throughout the day.
- Which companies in Berlin are hiring for vLLM?
- The companies currently listing vLLM roles are JetBrains (3 roles) and EverAI (1 role).
- What skills pair well with vLLM?
- Roles that mention vLLM also frequently ask for LLM (100%), Fine-tuning (100%), Python (75%), Kubernetes (75%), and CUDA (75%).
- In which Berlin job roles is vLLM most in demand?
- Based on current listings, vLLM appears most in AI/ML roles (2% of open roles in that family, 3 of 158) and Leadership roles (1%, 1 of 129).
By seniority
- Senior 25% (1)
- Mid 25% (1)
- Staff / Principal 25% (1)
- Lead / Manager 25% (1)
By job family
Share of open roles in each family that mention vLLM
- AI/ML 2% (3/158)
- Leadership 1% (1/129)
By company
Top companies looking for vLLM
- JetBrains (3)
- EverAI (1)