CUDA and PyTorch
- 1 Introduction
- 2 CUDA
- 2.1 Jetson TX2i
- 2.2 Compute Capability
- 3 PyTorch
- 3.1 No Forward Compatibility
- 3.2 Releases
- 4 Ultralytics
- 5 Upgrading
Introduction
NVIDIA CUDA versioning is complicated, and PyTorch versioning, which depends on both the CUDA and Python versions, makes it even more complicated.
CUDA
NVIDIA documentation is fragmented, and Jetson support beyond the basics is handled by a single developer (Dustin Franklin).
Jetson TX2i
Jetpack is the software package flashed to the Jetson. It includes the operating system (Linux4Tegra) and the drivers.
CUDA on a Jetson TX2i is tied directly to the Jetpack version and cannot be upgraded independently. Because Jetpack 4.6.3 is the last version supported on the Jetson TX2i, the versions of CUDA and its related libraries are fixed:
CUDA 10.2
cuDNN 8.2.1
TensorRT 8.2.1
Jetpack 4.6.3 webpage: https://developer.nvidia.com/jetpack-sdk-463
Compute Capability
CUDA compute capability is a property of the NVIDIA hardware. Each CUDA version supports only a limited range of compute capabilities, so newer hardware requires a newer CUDA version. Specifically, the newest hardware supported by each relevant CUDA version is:
CUDA 10.2: compute capability 7.5 (Turing, e.g. GeForce GTX 16 Series and RTX 20 Series)
CUDA 11.7: compute capability 8.7 (all of Ampere, e.g. GeForce RTX 30 Series)
CUDA support matrix: https://en.wikipedia.org/wiki/CUDA#GPUs_supported
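The upper limits above can be expressed as a small lookup. This is a minimal sketch rather than an official tool: the table holds only the two CUDA versions discussed here, the function name `cuda_supports` is made up for illustration, and only the upper limit is checked (CUDA versions also drop support for old hardware, which is ignored here).

```python
# Highest compute capability supported by each CUDA version,
# limited to the two versions listed on this page.
MAX_COMPUTE_CAPABILITY = {
    "10.2": (7, 5),  # Turing
    "11.7": (8, 7),  # Ampere
}


def cuda_supports(cuda_version: str, compute_capability: tuple) -> bool:
    """Return True if the hardware is not newer than the CUDA version's limit.

    Hypothetical helper: only the upper bound is checked.
    """
    return compute_capability <= MAX_COMPUTE_CAPABILITY[cuda_version]


print(cuda_supports("10.2", (7, 5)))  # True: Turing on CUDA 10.2
print(cuda_supports("10.2", (8, 6)))  # False: Ampere is too new for CUDA 10.2
print(cuda_supports("11.7", (8, 9)))  # False: newer than the CUDA 11.7 limit
```

On a computer with PyTorch and an NVIDIA GPU, `torch.cuda.get_device_capability()` returns the `(major, minor)` tuple that could be passed as `compute_capability`.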
PyTorch
Each PyTorch version has a minimum CUDA version that it can use. Specifically:
PyTorch 1.13.1 requires CUDA 10.2 or greater: https://github.com/pytorch/pytorch/tree/v1.13.1#prerequisites
PyTorch 2.0.0 requires CUDA 11.0 or greater: https://github.com/pytorch/pytorch/releases/tag/v2.0.0
As the Jetson TX2i uses CUDA 10.2, the last version of PyTorch it supports is 1.13.1.
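The reasoning above can be sketched as a lookup over the minimum CUDA versions. This is only an illustration: the table contains just the two PyTorch versions listed on this page, and `latest_pytorch_for` is a hypothetical helper name.

```python
# Minimum CUDA version needed by each PyTorch version,
# limited to the two versions listed on this page.
MIN_CUDA = {
    (1, 13, 1): (10, 2),
    (2, 0, 0): (11, 0),
}


def latest_pytorch_for(cuda_version: tuple):
    """Return the newest listed PyTorch version usable with the given CUDA version.

    Returns None if the CUDA version is older than every listed minimum.
    """
    candidates = [
        pytorch for pytorch, min_cuda in MIN_CUDA.items()
        if min_cuda <= cuda_version
    ]
    return max(candidates) if candidates else None


print(latest_pytorch_for((10, 2)))  # (1, 13, 1): the Jetson TX2i case
print(latest_pytorch_for((11, 7)))  # (2, 0, 0)
```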
As of Fall 2024, the Jetson TX2i is no longer in use. Instead, a Raspberry Pi 5 will be used on the drone to run inference (CPU only). Hence, compatibility with the CUDA version no longer matters for the inference portion.
No Forward Compatibility
PyTorch does not support forward compatibility. Specifically, loading a model created with a newer version for inference on an older version is not supported (see "Cannot load pytorch1.7 model in pytorch1.6").
To use a model for inference on a Jetson TX2i, which is limited to PyTorch 1.13.1, the training code must use PyTorch 1.13.1 or earlier to create the model.
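A simple guard for this rule compares the version used for training against the version used for inference. The sketch below assumes plain release version strings such as `1.13.1` or `1.13.1+cu117` (the format of `torch.__version__`); `parse_version` and `can_load` are hypothetical helper names, not part of PyTorch.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.13.1+cu117' into (1, 13, 1), ignoring the variant suffix.

    Assumes a plain release version; pre-release strings are not handled.
    """
    release = version.split("+")[0]
    return tuple(int(part) for part in release.split(".")[:3])


def can_load(trained_with: str, inference_with: str) -> bool:
    """Return True if the training version is not newer than the inference version."""
    return parse_version(trained_with) <= parse_version(inference_with)


print(can_load("1.13.1+cu117", "1.13.1"))  # True: same release, different variant
print(can_load("2.0.0", "1.13.1"))         # False: model created on a newer version
```

Storing the training version alongside the saved model would allow a check like this to run before loading on the inference device.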
Releases
Each PyTorch release version has several precompiled package variants built against different versions of CUDA, as well as a CPU-only variant. Models trained on the same PyTorch release version are interchangeable regardless of variant.
Specifically, PyTorch 1.13.1 has the following variants:
CUDA 11.6
CUDA 11.7
CPU
There is no difference in CUDA compute capability support between CUDA 11.6 and 11.7, so CUDA 11.7 is preferred. However, as CUDA 11.7 only supports hardware up to CUDA compute capability 8.7, a computer with newer NVIDIA hardware must use either the CPU variant or a user-compiled package with a later version of CUDA.
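The choice of variant for the training computer can be sketched as follows. This is an illustration under assumptions: `pick_variant` and `pip_command` are hypothetical helpers, only the upper compute capability limit is checked, and the install command follows the pattern of the PyTorch previous-versions instructions, so it should be verified there before use.

```python
PYTORCH_VERSION = "1.13.1"
MAX_CAPABILITY_CUDA_11_7 = (8, 7)  # newest hardware supported by CUDA 11.7


def pick_variant(compute_capability=None) -> str:
    """Pick the precompiled PyTorch 1.13.1 variant for a training computer.

    compute_capability is a (major, minor) tuple, or None without an NVIDIA GPU.
    Hardware newer than the CUDA 11.7 limit falls back to the CPU variant.
    """
    if compute_capability is None or compute_capability > MAX_CAPABILITY_CUDA_11_7:
        return "cpu"
    return "cu117"


def pip_command(variant: str) -> str:
    """Build the install command for the chosen variant (assumed pattern)."""
    return (
        f"pip install torch=={PYTORCH_VERSION}+{variant} "
        f"--extra-index-url https://download.pytorch.org/whl/{variant}"
    )


print(pick_variant((8, 6)))  # cu117
print(pick_variant((8, 9)))  # cpu
print(pip_command(pick_variant((8, 6))))
```

After installation, `torch.version.cuda` reports the CUDA version the installed package was built against, which confirms which variant is in use.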
Ultralytics
Ultralytics is not affected by PyTorch versioning as long as dependency requirements are satisfied. PyTorch 1.13.1 meets these requirements.
Upgrading
There are several options:
Do nothing: do not upgrade the training computer’s NVIDIA hardware beyond compute capability 8.7. Easy, but restricts hardware.
Build PyTorch 1.13.1 with CUDA 11.8 or greater to support newer NVIDIA hardware for training. Supports all hardware, but difficult to implement.
Purchase a newer Jetson with CUDA 11.7 support to run PyTorch 2.0.0 or greater, which has release variants with CUDA 11.8 or greater. Supports all hardware, but costs additional money.
Hack or develop an operating system and/or drivers for the Jetson TX2i that support CUDA 11.7. Supports all hardware, but very difficult to implement.