Instructions on Using Nvidia GPU (CUDA) for Computing in Docker
- Install Nvidia `cuda-drivers` (or equivalent) on your Linux machine following the instructions at CUDA Downloads. Notice that instead of installing `cuda` (using `sudo apt-get install cuda`), it is suggested that you install `cuda-drivers` only (using `sudo apt-get install cuda-drivers`). This is because the CUDA toolkit (the package `cuda`) is not needed on your Linux host machine to run GPU-enabled Docker containers starting from Docker 19.03. Of course, it doesn't hurt to install the package `cuda` besides using more disk space.
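  For reference, below is a minimal sketch of a driver-only installation on Ubuntu 18.04 (x86_64) using Nvidia's CUDA network repository. The repository path and signing key are assumptions from the CUDA 10.x era and may have changed, so verify them against the CUDA Downloads page.

  ```bash
  # Assumed repository for Ubuntu 18.04 x86_64; adjust the path for your distribution.
  REPO=https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64

  # Pin the Nvidia repository so its packages take precedence over distribution packages.
  wget $REPO/cuda-ubuntu1804.pin
  sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600

  # Add the repository signing key (the key may have been rotated since) and the repository.
  sudo apt-key adv --fetch-keys $REPO/7fa2af80.pub
  echo "deb $REPO/ /" | sudo tee /etc/apt/sources.list.d/cuda.list

  # Install the drivers only, not the full CUDA toolkit.
  sudo apt-get update
  sudo apt-get install -y cuda-drivers
  ```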
- Confirm that the CUDA drivers have been installed correctly.

  ```bash
  nvidia-smi
  ```

  If the command `nvidia-smi` is available but raises the error message "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.", reboot your Linux machine and try again.
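  Before rebooting, a quick (illustrative) check of whether the Nvidia kernel module is loaded can help confirm that this is the problem.

  ```bash
  # Check whether the Nvidia kernel module is currently loaded.
  lsmod | grep nvidia

  # If nothing is printed, reboot so the freshly installed driver can be loaded.
  sudo reboot
  ```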
- Make sure that you have Docker 19.03+ installed on your Linux machine.
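  For example, you can check the installed version as follows.

  ```bash
  docker --version   # should report version 19.03 or newer
  ```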
- Install nvidia-docker on your Linux machine. For example, you can install it on the Debian series of Linux distributions (Debian, Ubuntu, Linux Mint, etc.) using the following commands.

  ```bash
  distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
  curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
  curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
  sudo apt-get update
  sudo apt-get install -y nvidia-docker2
  ```
- Restart Docker. You can use one of the following commands (depending on whether systemd is used to manage services).

  ```bash
  sudo systemctl restart docker
  # or
  sudo service docker restart
  ```
- Test that GPU-enabled Docker containers can be run correctly.

  ```bash
  docker run --gpus all nvidia/cuda:10.2-base nvidia-smi
  ```
- Extend official Nvidia Docker images to customize your own Docker images for GPU applications if needed. Generally speaking, nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 is the best Docker image to extend. If you are using PyTorch (which has built-in CUDA and cuDNN), you can use nvidia/cuda:10.1-base-ubuntu18.04 to reduce the size of your Docker image. floydhub/dockerfiles and the PyTorch Dockerfile are good examples to refer to. If you want to use Python packages that do not have built-in CUDA and cuDNN support, you might have to install the library `cuda-10-1` manually.

  ```bash
  sudo apt-get install cuda-10-1
  ```
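  Below is a minimal, illustrative sketch of such a Dockerfile. The base image tag and the installed packages are assumptions; adapt them to your own application.

  ```dockerfile
  # Illustrative sketch: extend an official CUDA runtime image for a Python GPU application.
  FROM nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04

  # Install Python; replace the packages below with your own dependencies.
  RUN apt-get update \
      && apt-get install -y --no-install-recommends python3 python3-pip \
      && rm -rf /var/lib/apt/lists/*

  # Install GPU-enabled Python libraries here (versions and packages are up to you).
  RUN pip3 install --no-cache-dir numpy

  CMD ["python3"]
  ```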
- Run GPU applications in Docker containers. Please refer to nvidia-docker#usage for examples.
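  For instance, a run restricted to a single GPU might look like the following sketch, where `my-gpu-image` and `train.py` are hypothetical placeholders for your own image and entry point.

  ```bash
  # Expose only GPU 0 to the container; use --gpus all to expose every GPU.
  docker run --rm --gpus device=0 my-gpu-image python3 train.py
  ```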
General Tips on GPU
- You can list all GPU devices using the following command in Linux.

  ```bash
  lspci -v | grep VGA
  ```
- You can use the following command to query the live GPU usage on both Windows and Linux.

  ```bash
  nvidia-smi --query-gpu=timestamp,name,pci.bus_id,driver_version,pstate,pcie.link.gen.max,pcie.link.gen.current,temperature.gpu,utilization.gpu,utilization.memory,memory.total,memory.free,memory.used --format=csv -l 5
  ```

  For more discussions, please refer to Useful nvidia-smi Queries and Visualize Nvidia GPU Usage. In addition, you can use the command `nvtop` to check the live usage of GPUs on Linux. `nvtop` is recommended as it presents simple visualizations in addition to the current usage statistics. `nvtop` can be installed on Ubuntu using the following command.

  ```bash
  apt-get install nvtop
  ```
Docker Images
Below are a few good examples of Docker images supporting Nvidia GPU.
References
- https://github.com/NVIDIA/nvidia-docker
- https://github.com/NVIDIA/nvidia-docker#ubuntu-16041804-debian-jessiestretchbuster
- https://developer.download.nvidia.com/compute/machine-learning/repos/