
Docker for Nvidia GPU

Instructions on Using Nvidia GPUs (CUDA) for Computing in Docker

  1. Install Nvidia cuda-drivers (or equivalent) on your Linux machine following the instructions at CUDA Downloads. Notice that instead of installing cuda (using sudo apt-get install cuda), it is suggested that you install cuda-drivers only (using sudo apt-get install cuda-drivers). This is because the CUDA toolkit (the package cuda) is not needed on your Linux host machine to run GPU-enabled Docker containers starting from Docker 19.03. Of course, it doesn't hurt to install the package cuda, apart from using more disk space.
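
    A minimal sketch of the commands is shown below, assuming an Ubuntu host on which the CUDA network repository has already been added to APT following the instructions at CUDA Downloads (package names may differ across distributions).

    # install only the driver packages; the CUDA toolkit itself is not required on the host
    sudo apt-get update
    sudo apt-get install -y cuda-drivers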

  2. Confirm that the CUDA drivers have been installed correctly.

    nvidia-smi
    

    If the command nvidia-smi is available but raises the following error message,

    NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
    Make sure that the latest NVIDIA driver is installed and running.

    reboot your Linux machine and try again.

  3. Make sure that you have Docker 19.03+ installed on your Linux machine.
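
    You can check the installed version as shown below; Docker 19.03+ supports the --gpus flag natively.

    docker --version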

  4. Install nvidia-docker on your Linux machine. For example, you can install it on Debian-based Linux distributions (Debian, Ubuntu, Linux Mint, etc.) using the following commands.

    distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update 
    sudo apt-get install -y nvidia-docker2
    
  5. Restart Docker. You can use one of the following commands (depending on whether systemd is used to manage services).

    sudo systemctl restart docker
    # or 
    sudo service docker restart
    
  6. Test that GPU-enabled Docker containers can be run correctly.

    docker run --gpus all nvidia/cuda:10.2-base nvidia-smi
    
  7. Extend official Nvidia Docker images to customize your own Docker images for GPU applications if needed. Generally speaking, nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 is the best Docker image to extend. If you are using PyTorch (which has built-in CUDA and cuDNN), you can use nvidia/cuda:10.1-base-ubuntu18.04 to reduce the size of your Docker image. floydhub/dockerfiles and the PyTorch Dockerfile are good examples to refer to. If you want to use Python packages that do not have built-in CUDA and cuDNN support, you might have to install the package cuda-10-1 manually.

    sudo apt-get install cuda-10-1
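
    As a sketch of extending an official image, the commands below write a minimal Dockerfile (via a shell here-document) and then build and test it. The base image tag is one of those mentioned above; the package list and the image name my-gpu-app are illustrative only.

    # write a minimal Dockerfile extending an official Nvidia CUDA image
    cat > Dockerfile <<'EOF'
    FROM nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
    RUN apt-get update \
        && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends python3 python3-pip \
        && rm -rf /var/lib/apt/lists/*
    EOF
    # build the customized image and confirm that it can see the GPUs
    docker build -t my-gpu-app .
    docker run --gpus all my-gpu-app nvidia-smi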
    
  8. Run GPU applications in Docker containers. Please refer to nvidia-docker#usage for examples.
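
    For example, the commands below illustrate the --gpus flag of Docker 19.03+ for exposing different sets of GPUs to a container (the image tag is the one used above).

    # expose all GPUs
    docker run --gpus all nvidia/cuda:10.2-base nvidia-smi
    # expose two GPUs
    docker run --gpus 2 nvidia/cuda:10.2-base nvidia-smi
    # expose specific GPUs by device index
    docker run --gpus '"device=0,1"' nvidia/cuda:10.2-base nvidia-smi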

General Tips on GPU

  1. You can list all GPU devices using the following command in Linux.

    lspci -v | grep VGA
    
  2. You can use the following command to query the live GPU usage on both Windows and Linux.

    nvidia-smi --query-gpu=timestamp,name,pci.bus_id,driver_version,pstate,pcie.link.gen.max,pcie.link.gen.current,temperature.gpu,utilization.gpu,utilization.memory,memory.total,memory.free,memory.used --format=csv -l 5
    

    For more discussions, please refer to Useful nvidia-smi Queries and Visualize Nvidia GPU Usage. In addition, you can use the command nvtop to check the live usage of GPUs on Linux. nvtop is recommended as it presents simple visualizations in addition to the current usage statistics. nvtop can be installed on Ubuntu using the following command.

    sudo apt-get install nvtop
    

Docker Images

Below are a few good examples of Docker images with Nvidia GPU support.
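
For instance, the official nvidia/cuda images referenced earlier in this post can be pulled directly from Docker Hub; the tags below are the ones mentioned above.

    docker pull nvidia/cuda:10.2-base
    docker pull nvidia/cuda:10.1-base-ubuntu18.04
    docker pull nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04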

