Conversation

@tmartin-gh
Collaborator

No description provided.

@tmartin-gh tmartin-gh self-assigned this Oct 15, 2025
@copy-pr-bot
copy-pr-bot bot commented Oct 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@tmartin-gh
Collaborator Author

/build

3 similar comments
@tmartin-gh
Collaborator Author

/build

@cliffburdick
Collaborator

/build

@cliffburdick
Collaborator

/build

Collaborator

@cliffburdick cliffburdick left a comment

should we use a variable so it's only updated in one place?
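
One way to read this suggestion is sketched below in Python, under stated assumptions. The constant names and the `derived_names` helper are hypothetical and do not come from the MatX repository; only the string formats (base image, CuPy package, version tag, release image) are taken from the values quoted in this PR. The idea is that every version-dependent string is derived from a single constant, so a CUDA bump touches one line.

```python
# Hypothetical sketch, not MatX code: derive all version-dependent names
# from one place so a CUDA upgrade is a one-line change.
CUDA_VERSION = "13.0.1"    # assumed single source of truth
UBUNTU_VERSION = "24.04"   # assumed single source of truth


def derived_names(cuda_version=CUDA_VERSION,
                  ubuntu_version=UBUNTU_VERSION,
                  arch="amd64"):
    """Return the strings this PR edits by hand, built from the version."""
    # CuPy wheel names in this PR use only the CUDA major version.
    major = cuda_version.split(".")[0]
    # Tag format matches the MATX_VERSION_TAG value quoted in this PR.
    version_tag = f"{cuda_version}_ubuntu{ubuntu_version}"
    return {
        "base_image": f"nvidia/cuda:{cuda_version}-devel-ubuntu{ubuntu_version}",
        "cupy_package": f"cupy-cuda{major}x",
        "version_tag": version_tag,
        "release_image": f"ghcr.io/nvidia/matx/release:{version_tag}-{arch}",
    }


if __name__ == "__main__":
    for key, value in derived_names().items():
        print(f"{key}: {value}")
```

A helper like this would only cover the Python recipe directly; `setup.sh` and the workflow YAML would still need to read the same value some other way (for example from a shared file), which this sketch does not attempt.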

@cliffburdick
Collaborator

/build

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR attempts to update MatX's development and CI infrastructure to use CUDA 13.0.1, modifying three files: .devcontainer/setup.sh sets the default container version tag, .github/workflows/build-docs.yml updates the documentation build workflow container, and .devcontainer/recipe.py changes the base Docker image and CuPy dependency. These changes are part of a coordinated effort to standardize the project's container environment on a newer CUDA version, ensuring consistency across local development (devcontainer setup), CI documentation builds, and Python package dependencies. The changes follow the project's existing container-based development workflow structure, where .devcontainer/ defines the VS Code development container configuration and GitHub Actions workflows rely on pre-built release containers from ghcr.io/nvidia/matx/release.

Important Files Changed

| Filename | Score | Overview |
|---|---|---|
| .devcontainer/recipe.py | 1/5 | Updates base Docker image to nvidia/cuda:13.0.1-devel-ubuntu24.04 and CuPy to cupy-cuda13x, but CUDA 13.0.1 does not exist and will cause build failures |
| .devcontainer/setup.sh | 1/5 | Changes default MATX_VERSION_TAG from 12.9.1_ubuntu24.04 to 13.0.1_ubuntu24.04, referencing a non-existent container version |
| .github/workflows/build-docs.yml | 1/5 | Updates documentation build container image to ghcr.io/nvidia/matx/release:13.0.1_ubuntu24.04-amd64, which does not exist |

Confidence score: 0/5

  • This PR will immediately fail and cannot be merged - it references a non-existent CUDA version that will break all container builds and CI workflows
  • Score reflects critical blocker issues: CUDA 13.0.1 does not exist (current CUDA versions are in the 12.x series), causing Docker image pull failures, pip package installation failures, and complete CI pipeline breakage across all three modified files
  • All files require immediate correction - the intended CUDA version should be verified and all references updated to an actual existing CUDA release (likely 12.6.x or similar) before this PR can proceed

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub Actions
    participant Auth as Blossom Authorization
    participant Container as Docker Container<br/>(CUDA 13.0.1)
    participant Build as Build System
    participant Pages as GitHub Pages

    Note over Dev,Pages: Container Setup Flow (.devcontainer/setup.sh)
    Dev->>Dev: Run setup.sh
    Dev->>Dev: Set MATX_VERSION_TAG="13.0.1_ubuntu24.04"
    Dev->>Dev: Detect platform (x86_64/aarch64)
    Dev->>Dev: Configure container instance

    Note over Dev,Pages: Container Recipe Build Flow (.devcontainer/recipe.py)
    Dev->>Container: Pull nvidia/cuda:13.0.1-devel-ubuntu24.04
    Container->>Container: Install system packages (clang-tidy, cmake, etc)
    Container->>Container: Install GNU toolchain
    Container->>Container: Install CMake 3.30.4
    Container->>Container: Install Nsight Compute & Systems
    Container->>Container: Build & install Doxygen 1.14.0
    Container->>Container: Build & install FFTW 3.3.10 (float & double)
    Container->>Container: Install Coveralls
    Container->>Container: Build & install BLIS 1.0
    Container->>Container: Setup fixuid for user matx (uid:2000)
    Container->>Container: Create Python venv at /opt/nvidia/venv
    Container->>Container: Install Python packages (cupy-cuda13x, sphinx, etc)

    Note over Dev,Pages: Documentation Build & Deploy Flow (.github/workflows/build-docs.yml)
    Dev->>GH: Push to main branch
    GH->>Auth: Check authorization
    Auth->>Auth: Verify actor in approved list
    Auth-->>GH: Authorization granted
    GH->>Container: Pull ghcr.io/nvidia/matx/release:13.0.1_ubuntu24.04-amd64
    GH->>Container: Checkout repository
    Container->>Build: Create build directory
    Build->>Build: Run cmake with MATX_BUILD_DOCS=ON via /opt/nvidia/run_from_venv.sh
    Build->>Build: Run make to build documentation
    Build-->>GH: Documentation artifacts (build/docs_input/sphinx/)
    GH->>Pages: Configure GitHub Pages
    GH->>Pages: Upload documentation artifact
    GH->>Pages: Deploy to GitHub Pages
    Pages-->>Dev: Documentation URL available

3 files reviewed, 2 comments

@cliffburdick
Collaborator

/build

@tmartin-gh
Collaborator Author

> Greptile Overview
>
> Greptile Summary
>
> but CUDA 13.0.1 does not exist and will cause build failures

Greptile is wrong, CUDA 13.0.1 most certainly does exist.

@cliffburdick cliffburdick merged commit 845c687 into NVIDIA:main Oct 25, 2025
1 check passed