Passing an Nvidia GPU through Proxmox to Docker Swarm for Plex

Introduction #

I’d wager that the majority of Homelab setups include some form of media storage and serving solution. For me, this is part of what got me interested in self-hosting initially, and remains one of the core services I run today. I’ve tried a number of different platforms over the years, but nothing has beaten Plex so far (especially thanks to the lifetime Plex Pass!).

The amount of 4K content within my media library is ever growing, and with this I’ve found an increasing need for on-the-fly transcoding, primarily down to needing lower resolutions for remote playback. The servers running my Docker Swarm setup aren’t particularly powerful machines (deliberately so, trying to keep the power bill down!), and while they’re able to handle a single transcode without issue, I’ve found that when multiple people start transcoding sessions simultaneously, the CPU becomes overwhelmed and playback starts to buffer.

The obvious solution to this issue is to add a GPU to allow hardware transcoding instead of relying on the CPU. However, as I’m finding increasingly often, Docker Swarm isn’t particularly well supported, and passing through a GPU to Plex wasn’t an easy task. I’ve put together this write-up primarily as an aide-memoire for when I inevitably have to recreate this setup in the future, but I hope it may be of use to you as well.

Warning: I am by no means an expert at anything detailed below. There’s a good chance that not all of these steps are 100% necessary, but this setup works for me as of writing. Please do your own research, and bear in mind that software (especially video drivers) is constantly evolving, which may break things. My Docker Swarm nodes are each separate Debian 13 VMs, so some of this guide is Debian specific. I’m using a Quadro P620 for this setup, but the steps should be broadly similar for other Nvidia GPUs.

Step 1: Host configuration #

Helpfully, Proxmox has a very comprehensive wiki page with details on how to set up a GPU for passthrough. The instructions are tailored towards sharing a single GPU across multiple VMs, which isn’t something I’ve attempted as of yet, however the majority of the instructions are still relevant. Primarily, you need to enable PCIe passthrough - which also has separate documentation if you need it - and then run their helpful pve-nvidia-vgpu-helper utility.

All you need to do is run the following command, and Proxmox will set up everything for you:

root@proxmox:~$ pve-nvidia-vgpu-helper setup

Once that’s done, use the Web UI and navigate to the VM you wish to pass the GPU into. In my case this is one of the Docker Swarm worker nodes. Under Hardware, add a new PCI Device and select your GPU under Raw Device, making sure to enable All Functions and PCI-Express, but leaving Primary GPU unselected.

Screenshot of Proxmox PCI passthrough configuration

Finally, reboot the Proxmox host.

Step 2: Blacklist Nouveau driver #

The rest of our configuration takes place inside the Docker Swarm VM you’ve just passed the GPU into. Firstly, we need to blacklist the open-source Nouveau driver so that it doesn’t claim the GPU ahead of the proprietary Nvidia driver we’ll install later.

root@worker-gpu:~$ echo "blacklist nouveau" > /etc/modprobe.d/blacklist-nouveau.conf
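
The blacklist only applies from the next boot, so it can be worth confirming that it worked after the reboot in Step 5. The following is a minimal sketch rather than part of my original setup: module_loaded is a hypothetical helper name, and it simply looks for the module in /proc/modules.

```shell
#!/bin/sh
# Succeeds if the named module appears in a /proc/modules-style listing.
# The optional second argument is the listing to read; it defaults to
# /proc/modules and exists mainly so a saved copy can be checked instead.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded nouveau; then
    echo "nouveau is still loaded"
else
    echo "nouveau is not loaded"
fi
```

If Nouveau still shows up as loaded after rebooting, regenerating the initramfs with update-initramfs -u and rebooting again is a common next thing to try.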

Step 3: Update repositories #

First up is enabling the contrib, non-free and non-free-firmware components. This step may vary depending on the version of Debian you’re running, but for me on Debian 13 I just had to edit /etc/apt/sources.list.d/debian.sources and make sure the entry contains the following:

Components: main contrib non-free non-free-firmware
X-Repolib-Name: debian
Suites: trixie
Types: deb
URIs: http://deb.debian.org/debian

Secondly, we need to add the Nvidia Container Repository. Start by grabbing the key with:

root@worker-gpu:~$ wget https://nvidia.github.io/libnvidia-container/gpgkey -O /etc/apt/keyrings/nvidia.asc

Then create /etc/apt/sources.list.d/nvidia.sources with the following:

X-Repolib-Name: nvidia
Signed-By: /etc/apt/keyrings/nvidia.asc
Suites: /
Types: deb
URIs: https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH)

Finally, we also need the Nvidia CUDA repository:

root@worker-gpu:~$ wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
root@worker-gpu:~$ dpkg -i cuda-keyring_1.1-1_all.deb

Step 4: Install packages #

There’s a fairly long list here, and I’m not 100% sure they’re all needed, but this is the combination that worked for me!

root@worker-gpu:~$ apt update
root@worker-gpu:~$ apt install gpg linux-headers-amd64 firmware-misc-nonfree nvidia-kernel-dkms nvidia-container-toolkit nvidia-driver nvidia-cuda-dev nvidia-smi nvtop

Step 5: Enable Kernel modules #

root@worker-gpu:~$ dkms autoinstall
root@worker-gpu:~$ reboot

After the reboot, you should be able to run nvtop and see your device. Don’t proceed any further until you can see it!

Step 6: Configure Nvidia Container Runtime for Swarm #

Edit the file /etc/nvidia-container-runtime/config.toml, uncomment the line beginning swarm-resource and change it to the following:

swarm-resource = "DOCKER_RESOURCE_NVIDIA-GPU"
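
If you’d rather script this edit than make it by hand, a sed substitution along these lines should work. Treat it as a sketch: set_swarm_resource is a hypothetical helper name, and I’m assuming the file contains a single swarm-resource line, commented out or not. A .bak copy of the original file is kept next to it.

```shell
#!/bin/sh
# Replace the (possibly commented-out) swarm-resource line in the given
# file with the value Swarm needs, keeping a .bak copy of the original.
set_swarm_resource() {
    sed -i.bak \
        's|^#*[[:space:]]*swarm-resource[[:space:]]*=.*|swarm-resource = "DOCKER_RESOURCE_NVIDIA-GPU"|' \
        "$1"
}

cfg=/etc/nvidia-container-runtime/config.toml
if [ -f "$cfg" ]; then
    set_swarm_resource "$cfg"
    grep '^swarm-resource' "$cfg"   # show the resulting line
else
    echo "$cfg not found; is nvidia-container-toolkit installed?" >&2
fi
```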

Step 7: Configure Docker daemon #

In order to tell Docker which GPU it is allowed to advertise, we need to find the UUID of our GPU. There are many ways of doing this, but the following one-liner monstrosity worked for me:

root@worker-gpu:~$ cat /proc/driver/nvidia/gpus/0000:`lspci | grep 'VGA compatible controller' | grep 'NVIDIA' | cut -f1 -d' '`/information | grep UUID | cut -f4 -d' '
GPU-0244ed87-119f-3616-853d-d3366a7547512
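
As a possibly tidier alternative, nvidia-smi -L lists each GPU together with its UUID, so the UUID can be pulled out of that instead. This is a sketch I’m adding for illustration: extract_gpu_uuids is a hypothetical helper name, and it assumes each line of output looks like "GPU 0: Quadro P620 (UUID: GPU-...)".

```shell
#!/bin/sh
# Read `nvidia-smi -L` style lines on stdin and print one GPU UUID per line.
extract_gpu_uuids() {
    sed -n 's/.*(UUID: \(GPU-[^)]*\)).*/\1/p'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L | extract_gpu_uuids
else
    echo "nvidia-smi not found; install the driver packages first" >&2
fi
```

With more than one GPU in the VM this prints one UUID per line, so you can pick the one you want to advertise.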

Once you have the UUID, we can create/edit /etc/docker/daemon.json and add the following, making sure to substitute in your GPU UUID instead:

{
    "default-runtime": "nvidia",
    "node-generic-resources": [
        "NVIDIA-GPU=GPU-0244ed87-119f-3616-853d-d3366a7547512"
    ],
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "/usr/bin/nvidia-container-runtime"
        }
    }
}
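
A stray comma in daemon.json is enough to stop Docker from starting, so it can be worth checking that the file parses before restarting the service. A minimal sketch, assuming python3 is installed on the VM (json.tool is part of its standard library); valid_json is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds only if the given file exists and parses as JSON.
# Assumes python3 is available on the PATH.
valid_json() {
    python3 -m json.tool "$1" >/dev/null 2>&1
}

cfg=/etc/docker/daemon.json
if valid_json "$cfg"; then
    echo "$cfg parses as JSON"
else
    echo "$cfg is missing or not valid JSON" >&2
fi
```

This only checks the syntax. It won’t notice a misspelt key or a wrong UUID.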

Finally, we can restart the Docker service with:

root@worker-gpu:~$ systemctl restart docker.service
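
To confirm that Docker picked up the new settings, you can ask the daemon which runtime it now uses by default. Again this is a sketch rather than something from my original notes; default_runtime is a hypothetical helper name.

```shell
#!/bin/sh
# Print the runtime the local Docker daemon uses by default.
default_runtime() {
    docker info --format '{{.DefaultRuntime}}'
}

if command -v docker >/dev/null 2>&1; then
    # With the daemon.json above in place, this should print "nvidia".
    default_runtime || echo "could not query the Docker daemon" >&2
else
    echo "docker not found" >&2
fi

# To see what the node advertises to the Swarm, run the following from a
# manager node (node inspect doesn't work on a worker). "worker-gpu" is
# the hostname used in this guide; substitute your own node name.
# docker node inspect worker-gpu \
#     --format '{{json .Description.Resources.GenericResources}}'
```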

Step 8: Create our Plex Compose file #

I’m not going to recreate my entire Plex docker-compose.yml file here, but the Plex service looks something like this:

services:
  plex:
    image: lscr.io/linuxserver/plex:1.42.2
    restart: unless-stopped
    ports:
      - 32400:32400/tcp
      - 8324:8324/tcp
      - 32469:32469/tcp
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp
    env_file:
      - stack.env
    privileged: true
    security_opt:
      - seccomp=unconfined
    deploy:
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: "NVIDIA-GPU"
                value: 1
    healthcheck:
      test: curl --fail http://localhost:32400/web || exit 1
      interval: 60s
      retries: 5
      start_period: 360s
      timeout: 10s
    volumes:
      - /path/to/config:/config
      - /path/to/movies:/movies
      - /path/to/tv:/tv

My stack.env file contains the following entries:

PLEX_TOKEN=[redacted]
ADVERTISE_IP=[redacted]
PLEX_CLAIM=[redacted]
TZ=Europe/London
PLEX_ADDR=http://plex:32400
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,video,utility,display
PUID=0
PGID=0

Step 9: Profit? #

Time to cross your fingers and kick off a Plex transcode. With any luck, you should be able to see some activity on your GPU when you run nvtop within your VM, and (hw) should appear next to the video on the Plex ‘Now Playing’ dashboard.

As mentioned in the introduction, there’s a very good chance that these steps won’t be 100% accurate for you, but I hope they help you along your way to hardware transcoding happiness!