Introduction
In this blog post, we will explore how to leverage an NVIDIA GPU with Docker to supercharge your applications. Docker simplifies the deployment process by handling dependencies seamlessly. With the increasing demand for high-performance computing, more applications are utilizing GPU power to accelerate workloads.
A prime example of this trend is the growing field of Artificial Intelligence (AI). Hosting a local Large Language Model (LLM) with a tool such as Ollama keeps your data private. However, running these models on a CPU alone can be slow. By passing an NVIDIA GPU through to your Docker container, you can significantly speed up inference.
Another excellent use case is Plex transcoding. An NVIDIA GPU can greatly enhance the efficiency of this process, providing faster and smoother performance.
NVIDIA Drivers on the Host
First, we need to install the NVIDIA drivers on the host running Docker. The following example targets Ubuntu 24.04; adjust the driver version (550 here) to whatever is current for your GPU.
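# Install the headless NVIDIA driver and its utilities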
sudo apt update
sudo apt install nvidia-headless-550-server
sudo apt install nvidia-utils-550-server
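After the installation completes (a reboot may be needed for the kernel module to load), you can verify the driver on the host:
# Confirm the driver is loaded on the host
nvidia-smi
If this prints a table listing your GPUs, the host side is ready.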
NVIDIA Container Toolkit
Before we can pass an NVIDIA GPU to a Docker container, we need to install the NVIDIA Container Toolkit and configure Docker to recognize it. The following example demonstrates this process on Ubuntu 24.04 LTS.
Step 1: Add the NVIDIA Container Toolkit GPG Key and Repository
# Add the GPG key and repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
Step 2: Update and Install the NVIDIA Container Toolkit
# Update and install NVIDIA Container Toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
Step 3: Configure the NVIDIA Container Toolkit and Restart Docker
# Configure NVIDIA Container Toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
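The configure command adds an nvidia runtime entry to /etc/docker/daemon.json. If you want to double-check that Docker picked it up after the restart:
# Optional: confirm the nvidia runtime is registered with Docker
docker info | grep -i runtimes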
Step 4: Test the NVIDIA Container Toolkit
# Test GPU integration
docker run --gpus all nvidia/cuda:11.5.2-base-ubuntu20.04 nvidia-smi
When successful, this command will output the result of the nvidia-smi command, showing the available NVIDIA GPUs:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla M10                      Off |   00002A73:00:00.0 Off |                  N/A |
| N/A   30C    P8              8W /   53W |       3MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Tesla M10                      Off |   0000F039:00:00.0 Off |                  N/A |
| N/A   38C    P8              8W /   53W |       3MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                                    Usage |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
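The CUDA image above is only a convenient test image; any tag from the nvidia/cuda repository should work, as long as its CUDA version does not exceed what the driver supports (12.4 for the 550 driver, as shown in the header above). For example (tag availability may change, so check Docker Hub):
# Test with a newer CUDA base image
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi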
NVIDIA GPU Passthrough in Docker
With the NVIDIA Container Toolkit installed, we can now edit the docker-compose.yml file to let Docker use the NVIDIA GPU. Add the following section under the service that needs GPU access:
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
The count value specifies how many GPUs to pass to the container. You can also set count: all to expose every available GPU, or replace count with device_ids (for example, device_ids: ['0']) to pin specific GPUs by index.
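If you start containers with docker run instead of Compose, the --gpus flag accepts the same choices. A quick sketch, reusing the CUDA test image from earlier:
# Expose one GPU, all GPUs, or specific devices to a container
docker run --rm --gpus 1 nvidia/cuda:11.5.2-base-ubuntu20.04 nvidia-smi
docker run --rm --gpus all nvidia/cuda:11.5.2-base-ubuntu20.04 nvidia-smi
docker run --rm --gpus '"device=0,1"' nvidia/cuda:11.5.2-base-ubuntu20.04 nvidia-smi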
Example with Ollama
This configuration enables the Ollama container to utilize two NVIDIA GPUs, enhancing the performance of AI tasks.
services:
  ollama:
    volumes:
      - ./ollama/ollama:/root/.ollama
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:latest
    ports:
      - 11434:11434
    environment:
      - OLLAMA_KEEP_ALIVE=24h
    networks:
      - ollama-docker
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 2
              capabilities: [gpu]

  ollama-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: ollama-webui
    volumes:
      - ./ollama/ollama-webui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - 8080:8080
    environment: # https://docs.openwebui.com/getting-started/env-configuration#default_models
      - OLLAMA_BASE_URLS=http://ollama:11434 # comma-separated Ollama hosts
      - ENV=dev
      - WEBUI_AUTH=True
      - WEBUI_NAME=WEBUI Name
      - WEBUI_URL=http://url
      - WEBUI_SECRET_KEY=Secret Key
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped
    networks:
      - ollama-docker

networks:
  ollama-docker:
    external: false
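With the file in place, start the stack and check that both GPUs are visible from inside the container. The toolkit normally mounts the nvidia-smi binary into GPU-enabled containers, so this should print the same table as on the host:
# Start the stack and verify GPU visibility inside the container
docker compose up -d
docker exec ollama nvidia-smi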
Conclusion
Integrating an NVIDIA GPU with Docker can significantly enhance the performance of applications that require intensive computational power, such as AI and media transcoding. By installing the NVIDIA Container Toolkit, configuring Docker, and testing the GPU integration, you can leverage the full potential of your hardware. The Ollama example shows how to configure your docker-compose.yml file to enable GPU usage, giving you faster processing times and improved efficiency for demanding workloads.