[CI] Build and test rocm Python packages as part of ci.yml #1559

@ScottTodd

Issues like #1347 and #1552 have gone undetected because the only place we install the rocm packages we built and run rocm-sdk test is during PyTorch release builds:

  • - name: Run rocm-sdk sanity tests
      run: |
        rocm-sdk test
  • print("+++ Sanity checking installed torch (unavailable is okay on CPU machines):")
    sanity_check_output = capture(
        [sys.executable, "-c", "import torch; print(torch.cuda.is_available())"],
        cwd=tempfile.gettempdir(),
    )
    if not sanity_check_output:
        raise RuntimeError("torch package sanity check failed (see output above)")
    else:
        print(f"Sanity check output:\n{sanity_check_output}")

The ci.yml workflow that we run on pull requests and merged commits can do more than just build and test native ROCm packages; it could build and test ROCm Python packages too. Looking at these Windows workflows, we could merge them or have one reuse the other:

The build step that is missing is:

      - name: Build Python Packages
        run: |
          python ./build_tools/build_python_packages.py \
            --artifact-dir=${{ env.BUILD_DIR }}/artifacts \
            --dest-dir=${{ env.BUILD_DIR }}/packages \
            --version=${{ needs.setup_metadata.outputs.version }}
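For anyone wiring this step into ci.yml, the same invocation can be sketched in Python, for example to call it from a helper script. This is only a sketch: `build_packages_command` is a hypothetical helper, not something in the repo. The script path and the three flags are copied from the step above, and the `artifacts` and `packages` subdirectories mirror it.

```python
import sys
from pathlib import Path


def build_packages_command(build_dir: str, version: str) -> list[str]:
    """Return the argv for the 'Build Python Packages' step (hypothetical helper)."""
    build_path = Path(build_dir)
    return [
        sys.executable,
        "./build_tools/build_python_packages.py",
        f"--artifact-dir={(build_path / 'artifacts').as_posix()}",
        f"--dest-dir={(build_path / 'packages').as_posix()}",
        f"--version={version}",
    ]


if __name__ == "__main__":
    # Example values only; in CI these would come from env.BUILD_DIR and
    # needs.setup_metadata.outputs.version.
    print(" ".join(build_packages_command("build", "0.0.1.dev0")))
```

Returning the argv as a list rather than a shell string keeps the quoting the same on Linux and Windows runners.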

See a recent release workflow run for an example of that: https://github.com/ROCm/TheRock/actions/runs/17904146218/job/50902443944#step:15:66. Note that the packaging step takes about 7 minutes, after a 1h40m build (62% cache hits).

Beyond just building, we probably want to upload the wheels to some sort of dev release bucket, install them for testing on machines with GPUs, and eventually trigger dev PyTorch builds and tests too.
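The install-and-test half of that flow could look roughly like the following on a GPU runner. This is a sketch under assumptions: the wheels are assumed to already sit in a local directory (the bucket and download step are left out because they don't exist yet), everything needed is assumed to be a `.whl` file, and `install_and_test_commands` and `run_all` are hypothetical helpers. Only `rocm-sdk test` comes from the existing workflows.

```python
import subprocess
import sys
from pathlib import Path


def install_and_test_commands(wheel_dir: str) -> list[list[str]]:
    """Return the commands to install local wheels and run the sanity tests.

    Hypothetical helper: assumes wheel_dir holds wheels that were already
    downloaded from wherever the build job uploaded them.
    """
    wheels = sorted(p.as_posix() for p in Path(wheel_dir).glob("*.whl"))
    if not wheels:
        raise FileNotFoundError(f"no wheels found in {wheel_dir}")
    install = [
        sys.executable, "-m", "pip", "install",
        # Let pip resolve dependencies between the wheels from the same directory.
        f"--find-links={Path(wheel_dir).as_posix()}",
        *wheels,
    ]
    return [install, ["rocm-sdk", "test"]]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution means the command list can be checked on a CPU-only machine, while `run_all` is only called on the GPU runner.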
