Install NVIDIA GPU driver on Red Hat Enterprise Linux

This article explains how to install NVIDIA GPU drivers on Red Hat Enterprise Linux (RHEL).

Prerequisites

  • The host systems and the installed GPU devices must meet the system requirements for GPU support.

    For more information, see System requirements for GPU support.

  • Each host system must be updated with the latest OS kernel and packages, and must then be restarted before you install the GPU driver.

  • You must be root or have sudo privileges on the hosts.

Install the NVIDIA GPU driver

The following procedure explains how to install the driver on Red Hat Enterprise Linux.

We recommend that you always use the Linux distribution package manager to install the NVIDIA GPU drivers.

For more information about NVIDIA driver installation on the different RHEL releases, refer to Red Hat Enterprise Linux – NVIDIA Driver Installation Guide in the NVIDIA documentation.

Step 1: Choose the driver version and create an environment variable

Choose the driver major version that matches your GPU, based on the recommendations in System requirements for GPU support. Set this version as the shell environment variable DRIVER_VERSION, which the following installation steps use.

Example:
# Example: Using the LTS 535 driver
export DRIVER_VERSION=535
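
Because DRIVER_VERSION is later expanded into module stream and package names, a mistyped value only fails several steps later. The following is a minimal sketch of a sanity check, assuming the variable should contain digits only, as in the example above. The helper name is_major_version is hypothetical and not part of any NVIDIA or Red Hat tooling.

```shell
# Hypothetical helper: succeeds only for a digits-only major version such as 535.
is_major_version() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;  # empty, or contains a non-digit character
    *) return 0 ;;
  esac
}

if is_major_version "${DRIVER_VERSION:-}"; then
  echo "DRIVER_VERSION=${DRIVER_VERSION} looks valid"
else
  echo "DRIVER_VERSION must be set to a major version number such as 535" >&2
fi
```

The check only validates the format. Whether the chosen version suits your GPU still has to be confirmed against System requirements for GPU support.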

Step 2: Install the driver

  1. Set the shell environment variables distro and arch to match your RHEL release and CPU architecture.

    export distro=rhel9 # Options: rhel8 or rhel9
    export arch=x86_64
  2. Install the kernel headers and development packages:

    sudo dnf -y install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
  3. Enable the repositories that provide the third-party package dependencies. Run only the commands for your RHEL release:

    # RHEL 8
    sudo subscription-manager repos --enable=rhel-8-for-$arch-appstream-rpms
    sudo subscription-manager repos --enable=rhel-8-for-$arch-baseos-rpms
    sudo subscription-manager repos --enable=codeready-builder-for-rhel-8-$arch-rpms
    sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

    # RHEL 9
    sudo subscription-manager repos --enable=rhel-9-for-$arch-appstream-rpms
    sudo subscription-manager repos --enable=rhel-9-for-$arch-baseos-rpms
    sudo subscription-manager repos --enable=codeready-builder-for-rhel-9-$arch-rpms
    sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
  4. Add the NVIDIA CUDA network repository to the RHEL package manager (DNF):

    sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-$distro.repo
  5. Enable the DKMS driver module stream for the chosen driver version. Choose one of the following options.

    Option 1 - default DKMS driver stream

    sudo dnf module -y enable nvidia-driver:${DRIVER_VERSION}-dkms/default

    Option 2 - DKMS stream for multi-GPU NVLink systems

    sudo dnf module -y enable nvidia-driver:${DRIVER_VERSION}-dkms/fm

    For more information about module streams, see DNF module enablement in the NVIDIA documentation.

  6. Install the driver packages.

    sudo dnf install -y cuda-drivers-${DRIVER_VERSION}
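
The distro and arch values set in sub-step 1 can also be derived from the host instead of typed by hand. The following is a sketch under two assumptions: the host runs RHEL 8 or 9, and it is an x86_64 system, so the output of uname -m matches the architecture label in the NVIDIA repository. For other architectures, check the label that the NVIDIA repository uses. The helper names distro_from_version_id and cuda_repo_url are hypothetical.

```shell
# Hypothetical helpers, shown for illustration only.

# Map a RHEL VERSION_ID such as "9.4" to the repository label "rhel9".
distro_from_version_id() {
  echo "rhel${1%%.*}"
}

# Build the CUDA repository file URL used in sub-step 4.
cuda_repo_url() {
  echo "https://developer.download.nvidia.com/compute/cuda/repos/$1/$2/cuda-$1.repo"
}

# Assumption: the host runs RHEL, so VERSION_ID in /etc/os-release is 8.x or 9.x.
if [ -r /etc/os-release ]; then
  version_id="$(. /etc/os-release && echo "${VERSION_ID:-}")"
  distro="$(distro_from_version_id "$version_id")"
  arch="$(uname -m)"
  export distro arch
  echo "distro=${distro} arch=${arch}"
  echo "repo: $(cuda_repo_url "$distro" "$arch")"
fi
```

Printing the resulting URL before running dnf config-manager makes it easy to spot a wrong release or architecture label early.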

Step 3 (optional): Install additional packages for multi-GPU systems

For systems using multiple NVIDIA GPUs with NVLink GPU interconnect, you must install the NVIDIA Fabric Manager as well as the NVSwitch Configuration and Query library (NSCQ) packages for the configured driver version.

This requires that you selected the /fm module stream (Option 2) when you enabled the DKMS driver stream in Step 2.

sudo dnf install -y nvidia-fabricmanager-${DRIVER_VERSION} libnvidia-nscq-${DRIVER_VERSION}

Step 4: Restart the system

After you have installed the driver and any additional packages, restart the system.
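
After the restart, you may want to confirm that the loaded driver belongs to the major version you selected. The following is a minimal sketch, assuming DRIVER_VERSION is still set in the shell and that nvidia-smi was installed together with the driver packages. The helper name driver_major_of is hypothetical.

```shell
# Hypothetical helper: extract the major version from a full driver version string.
driver_major_of() {
  echo "${1%%.*}"
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # Query the version of the loaded driver; use the line reported for the first GPU.
  installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  if [ "$(driver_major_of "$installed")" = "${DRIVER_VERSION:-}" ]; then
    echo "Driver ${installed} matches DRIVER_VERSION=${DRIVER_VERSION}"
  else
    echo "Driver '${installed}' does not match DRIVER_VERSION=${DRIVER_VERSION:-unset}" >&2
  fi
else
  echo "nvidia-smi not found; the driver may not be installed correctly" >&2
fi
```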

Install the NVIDIA Container Toolkit

Install the NVIDIA Container Toolkit following the procedure Installing the NVIDIA Container Toolkit in the NVIDIA documentation.

Summary of the installation procedure (see the NVIDIA documentation for details):

  1. Install the prerequisites for the following steps.

  2. Add the NVIDIA Container Toolkit production repository to the OS package manager repositories.

  3. Install the nvidia-container-toolkit software package using the package manager.

    Make sure that you pin the software package to a fixed version.

We recommend that you always pin the package version to prevent automatic updates, and that you update to a newer container toolkit version only explicitly, as part of planned OS updates.
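
One possible way to pin the package is the DNF versionlock plugin. The following is a sketch, not the procedure from the NVIDIA documentation. It assumes that the NVIDIA Container Toolkit repository has already been added and that the plugin package is named python3-dnf-plugin-versionlock, as on RHEL 8 and 9. TOOLKIT_VERSION is a placeholder that you must set to the release you have validated. The run wrapper is hypothetical and only prints the commands unless DRY_RUN=0 is set.

```shell
# Hypothetical wrapper: print commands by default, run them only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Placeholder: set TOOLKIT_VERSION to the release you validated; no default is implied.
TOOLKIT_VERSION="${TOOLKIT_VERSION:-VERSION}"

# Install exactly the chosen version, then lock it against automatic updates.
run sudo dnf install -y "nvidia-container-toolkit-${TOOLKIT_VERSION}"
run sudo dnf install -y python3-dnf-plugin-versionlock
run sudo dnf versionlock add nvidia-container-toolkit
```

For a planned update, the lock can be lifted with sudo dnf versionlock delete nvidia-container-toolkit and added again after the new version is installed. Packages that the toolkit depends on may need the same treatment.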

Updating the driver

To update to the newest release of the same driver major version, run a regular package update with dnf.