vLLM supports generative and pooling models across various tasks. If a model supports more than one task, you can set the task via the --task argument. For each task, we list the model architectures ...
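As a sketch of how the `--task` selection works in practice, the command below serves a model for embedding rather than text generation. The model name is illustrative, not taken from the text above; check the vLLM supported-models list for architectures verified for each task.

```shell
# Serve a model in pooling (embedding) mode instead of the default
# generative mode by passing --task explicitly.
# BAAI/bge-base-en-v1.5 is an example embedding model, not one named
# in this document.
vllm serve BAAI/bge-base-en-v1.5 --task embed
```

If a model's architecture supports only one task, the flag can be omitted and vLLM infers it.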