# Build and test with CUDA 13.0.0 #286
## Conversation
While trying to add CUDA 13 builds here, I discovered that this project's `build.sh` script has a hard-coded list of CUDA architectures: https://github.com/rapidsai/cugraph-gnn/blob/c50d56d0d4987c587e6f377dd4eb9537f3d0440b/build.sh#L212

As described in #286 (comment), this proposes removing that hard-coded list in favor of using the `"RAPIDS"` set of architectures from `rapids-cmake`.

This also updates some `pre-commit` hook versions (unrelated, but low-risk, and I wanted to take advantage of the CI runs).

## Notes for Reviewers

### Benefits of this change

* keeps this project aligned with the rest of RAPIDS on CUDA architectures
* reduces the manual effort required to support new CUDA versions
* avoids the build time and binary size cost of building for older architectures that RAPIDS no longer supports

### So what architectures will `libwholegraph` now be built for?

```text
# before
70-real;75-real;80-real;86-real;90

# after (CUDA 12)
70-real;75-real;80-real;86-real;90a-real;100f-real;120a-real;120

# after (CUDA 13)
75-real;80-real;86-real;90a-real;100f-real;120a-real;120
```

Those lists come from here: https://github.com/rapidsai/rapids-cmake/blob/0b111489d1e6f8400e1fc88297623a2a9915fa77/rapids-cmake/cuda/set_architectures.cmake

Authors:

* James Lamb (https://github.com/jameslamb)

Approvers:

* Kyle Edwards (https://github.com/KyleFromNVIDIA)
* https://github.com/linhu-nv

URL: #295
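As a rough illustration of the change described above, here is a hedged shell sketch of the pattern being adopted: rather than maintaining a hard-coded architecture list, the build script passes the `RAPIDS` magic value through to CMake and lets `rapids-cmake` expand it. The variable name is invented for illustration; this is not the project's actual `build.sh`.

```shell
#!/bin/bash
# Sketch only: select CUDA architectures for the CMake invocation.
# "RAPIDS" is the magic value that rapids-cmake expands into the
# officially supported architecture list for the CUDA toolkit in use.
CMAKE_CUDA_ARCHS="${CMAKE_CUDA_ARCHS:-RAPIDS}"

# Before this change, a hard-coded list had to be updated by hand
# for every new CUDA version, e.g.:
#   CMAKE_CUDA_ARCHS="70-real;75-real;80-real;86-real;90"

echo "cmake -DCMAKE_CUDA_ARCHITECTURES=${CMAKE_CUDA_ARCHS} -S . -B build"
```

With this pattern, supporting a new CUDA version requires no change to the build script at all; the architecture list is updated centrally in `rapids-cmake`.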
**alexbarghi-nv** left a comment:
What happens if we build with CUDA 13 but try to install the current PyTorch? Shouldn't PyTorch still work? Do they conflict on package versions? I'd rather we keep conda tests going if possible; plus, we currently have a customer that's using the nightly conda packages, and I don't want them to get broken.
**vyasr** left a comment:
Please open an issue with all the follow-up work associated with PyTorch and add it to our CUDA 13 rollout issue, so that we know to keep track of it for the overall rollout. Aside from that, LGTM, thanks!
Merged via `/merge`: commit `ef26ed9` into `rapidsai:branch-25.10`.
Put up #296 documenting the need to add those tests, and linked it in the big task list on rapidsai/build-planning#208
Contributes to rapidsai/build-planning#208

Updates the `:latest` and `:25.10-latest` tags to CUDA 13.0.0.

## Notes for Reviewers

### Is this safe to merge?

Once these are in, I think yes:

* [x] NVIDIA/cuopt#366
* [x] rapidsai/cugraph-gnn#286

At that point, the only thing it should affect is docs builds across repos that are already supporting CUDA 13 in all their other conda-based tests.

Authors:

* James Lamb (https://github.com/jameslamb)

Approvers:

* Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #303
Contributes to rapidsai/build-planning#208

* `cuda-python`: `>=12.9.2` (CUDA 12), `>=13.0.1` (CUDA 13)
* `cupy`: `>=13.6.0`

Contributes to rapidsai/build-planning#68
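For context, a hedged sketch of how CUDA-version-specific pins like these are typically expressed in a RAPIDS `dependencies.yaml`. The structure below follows my understanding of the `rapids-dependency-file-generator` schema (`specific`, `matrices`, `matrix`, `packages`); the dependency-group name and exact pin strings are illustrative, not copied from this repository.

```yaml
# Illustrative sketch, not the actual file: per-CUDA-version pins
# in a dependencies.yaml matrix.
dependencies:
  cuda_python:  # hypothetical group name
    specific:
      - output_types: [conda, requirements, pyproject]
        matrices:
          - matrix: {cuda: "12.*"}
            packages: ["cuda-python>=12.9.2,<13.0a0"]
          - matrix: {cuda: "13.*"}
            packages: ["cuda-python>=13.0.1,<14.0a0"]
```

The generator writes the matching branch of each matrix into the generated `pyproject.toml` and conda environment files, which is why the pins land in source control.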
* `dependencies.yaml` matrices (i.e., the ones that get written to `pyproject.toml` in source control)

## Notes for Reviewers
This switches GitHub Actions workflows to the `cuda13.0` branch from here: rapidsai/shared-workflows#413

A future round of PRs will revert that back to `branch-25.10`, once all of RAPIDS supports CUDA 13.

### What about PyTorch?
There are now PyTorch CUDA 13 nightly wheels, but not yet conda packages.
CUDA 13 support is tracked in pytorch/pytorch#159779, and eventually will show up as PRs in https://github.com/conda-forge/pytorch-cpu-feedstock (ignore the feedstock name... that is really where the CUDA-enabled builds are too).
This PR proposes skipping CUDA 13 conda tests, so we can at least start producing nightly CUDA 13 packages here. If reviewers agree, I'll open an issue documenting the need to restore those tests.
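The skip described above could be implemented as a small guard in a CI script. The sketch below is hypothetical (the function name and version-detection mechanism are invented, not taken from this repository's CI); it only shows the shape of the decision: skip conda tests when the CUDA major version is 13.

```shell
#!/bin/bash
# Sketch only: hypothetical CI guard for skipping conda tests on
# CUDA 13 until PyTorch publishes CUDA 13 conda packages.
should_skip_conda_tests() {
  local cuda_version="$1"
  # Succeeds (returns 0) when the CUDA major version is 13.
  [ "${cuda_version%%.*}" = "13" ]
}

if should_skip_conda_tests "13.0.0"; then
  echo "skipping conda tests (no PyTorch CUDA 13 conda packages yet)"
else
  echo "running conda tests"
fi
```

When PyTorch CUDA 13 conda packages land in conda-forge, restoring the tests is a matter of deleting the guard, which is why tracking it in a follow-up issue is enough.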