NVIDIA GeForce RTX 5090 Support #422

Closed

aliasaria opened this issue May 1, 2025 · 2 comments · Fixed by transformerlab/transformerlab-api#263
Labels

api: Changes to be made to the transformerlab-api repo
hardware: Issues related to user hardware (GPUs, CPUs, Cloud)

Comments

@aliasaria
Member

NVIDIA 5090s require an updated PyTorch built against CUDA 12.8. This issue tracks the work required to make sure Transformer Lab works on 5090s.

Transformer Lab currently installs CUDA 12.1 using conda in install.sh https://github.com/transformerlab/transformerlab-api/blob/main/install.sh

More documentation:

https://docs.salad.com/tutorials/pytorch-rtx5090

pytorch/pytorch#145949

https://forums.developer.nvidia.com/t/software-migration-guide-for-nvidia-blackwell-rtx-gpus-a-guide-to-cuda-12-8-pytorch-tensorrt-and-llama-cpp/321330

We need to investigate:

  • Upgrading CUDA for everyone -- will any existing plugins break? Certain tools like vLLM, FlashAttention2, etc. are known to be sensitive to specific CUDA versions.
  • Upgrading CUDA to 12.8 just for 5090 users -- how would we do this? What will work and what won't? (A rough sketch of one approach is below.)
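One possible approach (not verified): have install.sh query the GPU's compute capability and only switch to a CUDA 12.8 PyTorch build for Blackwell cards. The compute capability threshold of 12 for RTX 50-series cards and the cu128 wheel index are assumptions that need to be checked against the final toolchain:

```bash
#!/usr/bin/env bash
# Sketch only: pick a PyTorch CUDA build based on the GPU's compute capability.
# Assumes nvidia-smi is on PATH, that RTX 50-series (Blackwell) cards report
# compute capability 12.0, and that a cu128 wheel index is published.

set -euo pipefail

# Highest compute capability among installed GPUs, e.g. "12.0" on an RTX 5090.
COMPUTE_CAP=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | sort -V | tail -n1)
MAJOR=${COMPUTE_CAP%%.*}

if [ "${MAJOR}" -ge 12 ]; then
    # Blackwell needs a PyTorch build compiled against CUDA 12.8.
    TORCH_INDEX="https://download.pytorch.org/whl/cu128"
else
    # Keep the existing CUDA 12.1 build for older GPUs.
    TORCH_INDEX="https://download.pytorch.org/whl/cu121"
fi

pip install --upgrade torch torchvision torchaudio --index-url "${TORCH_INDEX}"
```

This would keep the current behaviour for existing users and only change the install path when a 5090-class card is detected.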
@aliasaria
Member Author

From vLLM docs:

https://docs.vllm.ai/en/stable/getting_started/installation/gpu.html

As of now, vLLM’s binaries are compiled with CUDA 12.4 and public PyTorch release versions by default. We also provide vLLM binaries compiled with CUDA 12.1, 11.8, and public PyTorch release versions

vLLM is not a required part of our platform (it is only supported in a specific plugin) but it would be nice if we could still support it.
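If we keep the plugin, its setup step could at least warn when the installed PyTorch CUDA build is unlikely to match the available vLLM wheels. A rough sketch, assuming only the CUDA versions quoted above from the vLLM docs:

```bash
# Sketch only: warn if the installed PyTorch CUDA build is unlikely to match
# the vLLM wheels (default wheels target CUDA 12.4 per the vLLM docs above).
TORCH_CUDA=$(python -c "import torch; print(torch.version.cuda or '')")

case "${TORCH_CUDA}" in
    12.*) echo "PyTorch built with CUDA ${TORCH_CUDA}; default vLLM wheels target 12.4." ;;
    11.8) echo "PyTorch built with CUDA 11.8; a matching cu118 vLLM wheel is needed." ;;
    *)    echo "No CUDA build of PyTorch detected; the vLLM plugin cannot be enabled." >&2 ;;
esac
```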

@aliasaria
Member Author

From FlashAttention2
https://github.com/Dao-AILab/flash-attention

We highly recommend CUDA 12.8 for best performance. (This is said in the context of H100 / H800 GPUs, but it probably applies to all GPUs.)
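Before installing flash-attn on a 5090 we could also check that the installed torch build actually includes kernels for the card's architecture. A rough sketch, assuming Blackwell consumer GPUs show up as sm_120 in torch's compiled arch list:

```bash
# Sketch only: confirm the installed torch build includes the 5090's architecture
# (assumed to appear as sm_120) before attempting a flash-attn install.
ARCHS=$(python -c "import torch; print(' '.join(torch.cuda.get_arch_list()))")
echo "Compiled arch list: ${ARCHS}"

if ! echo "${ARCHS}" | grep -q "sm_12"; then
    echo "This torch build has no Blackwell (sm_12x) kernels; upgrade to a CUDA 12.8 build before installing flash-attn." >&2
    exit 1
fi
```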

@deep1401 added the api and hardware labels on May 1, 2025