Skip to content

Llm-D

Disaggregated Inference on Kubernetes: Routing, Scheduling, and Scaling Beyond One GPU

August 29, 2025