vllm-inference

Here are 2 public repositories matching this topic...

Huzaifa184 / openshift-rhoai-gpu-platform

Production-pattern Red Hat OpenShift AI 3.4.0 platform with bare-metal ESXi, GPU passthrough, KServe RawDeployment, DeepSeek R1 inference at 12–17 tok/s

kubernetes openshift bare-metal red-hat platform-engineering mlops kserve gpu-inference vllm deepseek rhoai openshift-ai ai-infrastructure vllm-inference

Updated Jun 26, 2026
HTML

RafiBG / L.A.M.S.

A privacy-first Slack bot that integrates local LLMs (Ollama/vLLM/LM Studio) with advanced tools like ComfyUI image generation, SearXNG local search, and On-Demand RAG Memory. Analyze files, execute Python code, and generate music, all while keeping your data inside your own network.

Updated Jun 23, 2026
Python
