The NVIDIA GPU Operator is based on the Operator Framework and automates the management of all NVIDIA software components needed to provision GPU worker nodes in a Kubernetes cluster: the driver, container runtime, device plugin, and monitoring.
The GPU Operator should run on nodes that are equipped with GPUs. To determine which nodes have GPUs, the operator relies on Node Feature Discovery (NFD) within Kubernetes.
https://developer.nvidia.com/blog/nvidia-gpu-operator-simplifying-gpu-management-in-kubernetes/
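In practice the operator is typically deployed with Helm. A minimal install sketch, assuming Helm 3 is available and using the nvidia chart repository (chart values such as driver version can be overridden per cluster):

$ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && helm repo update
$ helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator

Once installed, the operator uses the NFD labels to find GPU nodes and rolls out the components listed above as pods on those nodes.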
NVIDIA Container Runtime is a GPU-aware container runtime, compatible with the Open Containers Initiative (OCI) specification used by container engines such as Docker and CRI-O.
https://developer.nvidia.com/nvidia-container-runtime
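For Docker specifically, the nvidia runtime is registered in /etc/docker/daemon.json. A minimal sketch, assuming the NVIDIA Container Toolkit packages are already installed and nvidia-container-runtime is on the default path:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

$ sudo systemctl restart docker

Setting "default-runtime": "nvidia" in the same file makes the nvidia runtime the default for all containers on that host.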
To start a GPU-enabled CUDA container, specify the nvidia runtime:
$ docker run --rm --runtime=nvidia --gpus=all nvcr.io/nvidia/cuda:latest nvidia-smi
GPUs can be specified to the Docker CLI using either the --gpus option (starting with Docker 19.03) or the environment variable NVIDIA_VISIBLE_DEVICES. This variable controls which GPUs will be made accessible inside the container.
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html
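For example, to expose only the first two GPUs via the environment variable (a sketch assuming at least two GPUs on the host and the same CUDA image tag as above; the indices 0,1 are illustrative):

$ docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0,1 nvcr.io/nvidia/cuda:latest nvidia-smi

The equivalent selection with the --gpus option is --gpus '"device=0,1"'.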