This fully managed container orchestrator is purpose-built for modern AI workloads — delivering the scalability, performance, and reliability needed to power your most demanding model training and inference jobs.
Reduce operational complexity with a secure, streamlined, and up-to-date Kubernetes environment that is ready to orchestrate your AI workloads across multi-host installations.
Scale your clusters easily by adding new nodes with NVIDIA GPU and InfiniBand drivers pre-installed. Combined with Kubernetes' native scalability, this ensures rapid compute expansion whenever it is needed.
Enjoy a predictable AI training and inference experience by running AI workloads on resilient, highly available clusters backed by system monitoring and Kubernetes auto-healing* mechanisms.
Run multi-host training across thousands of NVIDIA GPUs with minimal effort. Our Managed Kubernetes natively scales GPU clusters over high-speed InfiniBand fabric — plus, it supports a wide range of AI frameworks and job schedulers to extend cluster capabilities.
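As a rough illustration of what a multi-host training job looks like on a GPU cluster, the sketch below uses a standard Kubernetes Indexed Job that fans out worker pods, each requesting a full set of node-local GPUs. The job name, container image, and per-node GPU count are hypothetical placeholders; the InfiniBand fabric and any framework-specific launcher (e.g. a job scheduler from the application library) are assumed to be configured separately.

```yaml
# Minimal sketch, assuming 4 GPU nodes with 8 GPUs each.
# Names (train-llm, my-registry/trainer:latest) are illustrative only.
apiVersion: batch/v1
kind: Job
metadata:
  name: train-llm
spec:
  completions: 4            # one worker pod per node
  parallelism: 4            # run all workers concurrently
  completionMode: Indexed   # gives each worker a stable rank via JOB_COMPLETION_INDEX
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: my-registry/trainer:latest
          resources:
            limits:
              nvidia.com/gpu: 8   # request all GPUs on the node
```

Requesting `nvidia.com/gpu` lets the scheduler place each worker on a node with free GPUs; the pre-installed drivers on new nodes mean no additional node setup is required before the job starts.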
Deploy and run AI applications in the cloud seamlessly with Managed Kubernetes. Serve production-ready models on GPU nodes, and natively load-balance web traffic across CPU-only instances within the same cluster.
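The GPU/CPU split described above can be sketched with ordinary Kubernetes primitives: a Deployment pinned to GPU nodes via a node selector, fronted by a Service that load-balances traffic. The label `node-type: gpu`, the image name, and the port numbers are assumptions for illustration, not fixed platform conventions.

```yaml
# Hypothetical sketch: inference pods scheduled onto GPU nodes,
# load-balanced by a standard Service. Label and image names are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      nodeSelector:
        node-type: gpu          # schedule only onto GPU-equipped nodes
      containers:
        - name: server
          image: my-registry/inference-server:latest
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1
---
apiVersion: v1
kind: Service
metadata:
  name: model-server
spec:
  selector:
    app: model-server
  ports:
    - port: 80
      targetPort: 8000
```

Because the Service selects pods by label rather than by node type, the same pattern also covers the CPU-only tier: a second Deployment without the GPU selector can share the cluster and its own Service.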
Kobayashi Applications streamlines access to a curated library of prebuilt images for your AI/ML workloads: from popular inference engines to Kubernetes-native job schedulers, these ready-to-use assets cut setup time so you can launch workloads faster.
A fast and easy-to-use library for LLM inference
Multilingual, strong coding/reasoning, efficient inference via vLLM
Groq hardware-accelerated LLaMA3 model, high-speed LLM inference
Multimodal MoE model, 10M context, efficient inference via vLLM
The Kubernetes-native machine learning (ML) toolkit
Mistral fine-tuned model, efficient LLM inference via vLLM
Multi-user JupyterHub with PyTorch...
Multimodal 128-expert model, 128K context, efficient via vLLM
Local open-source LLM tool, privacy-focused
A browser interface based on the Gradio library for Stable Diffusion.
An operator-based system for high-performance workloads with enhanced scheduling and resource management...
Manage your ML experiments in a Kubernetes cluster.
* This feature is currently in development.
Kubernetes is a registered trademark of The Linux Foundation (in the United States and other jurisdictions).