Skip to content

Conversation

@alexbarghi-nv
Copy link
Member

@alexbarghi-nv alexbarghi-nv commented Dec 2, 2024

Allows sampling of heterogeneous graphs.

Removes unbuffered sampling from the PyG examples and completely disables it in DGL. A future PR will completely drop PyG support for unbuffered sampling, and a future cugraph PR will drop support for unbuffered sampling in the distributed sampler.

Merge after rapidsai/cugraph#4795

Closes rapidsai/cugraph#4402

@alexbarghi-nv alexbarghi-nv changed the base branch from branch-24.12 to branch-25.02 December 2, 2024 17:11
@alexbarghi-nv alexbarghi-nv self-assigned this Dec 2, 2024
@alexbarghi-nv alexbarghi-nv added breaking Introduces a breaking change feature request New feature or request labels Dec 2, 2024
@alexbarghi-nv alexbarghi-nv added this to the 25.02 milestone Dec 2, 2024
@copy-pr-bot
Copy link

copy-pr-bot bot commented Dec 23, 2024

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@alexbarghi-nv alexbarghi-nv marked this pull request as ready for review January 13, 2025 15:52
@alexbarghi-nv alexbarghi-nv requested a review from a team as a code owner January 13, 2025 15:52

loader = NeighborLoader(
(feature_store, graph_store),
num_neighbors=[0, 1, 0, 1],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the 0 fanouts here used to test edge cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used to test whether we can properly exclude an edge type.

@alexbarghi-nv alexbarghi-nv requested review from a team as code owners January 15, 2025 15:35
Copy link
Member

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put up some suggestions on the notebook testing.

--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" \
--prepend-channel "${CPP_CHANNEL}" \
--prepend-channel "${PYTHON_CHANNEL}" \
| tee env.yaml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point in this script, CPP_CHANNEL and PYTHON_CHANNEL haven't yet been set. If you want the downloaded CI artifacts to be considered in the conda solve, you'll have to move this block from lower down up above this:

rapids-logger "Downloading artifacts from previous jobs"
CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)

If you do that, then it'd also be good to remove the rapids-mamba-retry install cugraph-dgl in favor of that coming through this, so there will be a single call to create the environment.

Here's an example: rapidsai/ucx-py#1101

Comment on lines 57 to 58
- depends_on_cudf
- depends_on_cugraph
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, I think here you want a depends_on_cugraph_dgl, so that a requirement like this gets added to the env.yaml:

  - cugraph-dgl==25.2.*,>=0.0.0a0

Then cudf and cugraph will come through automatically as part of cugraph-dgl's required dependencies.

- cugraph ={{ minor_version }}

That's a better pattern for CI, because it allows us to catch packaging problems of the form "cugraph-dgl depends on cudf but doesn't explicitly declare it" or something like that.

cuspatial uses this "consolidated solves" approach, you could follow that project's example:

https://github.com/rapidsai/cuspatial/blob/dbdc75ddea8422c18441a63e9f1fc42230db301c/dependencies.yaml#L43-L52

https://github.com/rapidsai/cuspatial/blob/dbdc75ddea8422c18441a63e9f1fc42230db301c/ci/test_notebooks.sh#L8-L19

@jameslamb jameslamb requested review from jameslamb and removed request for AyodeAwe January 15, 2025 19:36
includes:
- cuda_version
- depends_on_pytorch
- depends_on_cugraph_dgl
Copy link
Member

@jameslamb jameslamb Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work as-is because this project's dependencies.yaml doesn't yet have a depends_on_cugraph_dgl.

Add this:

  depends_on_cugraph_dgl:
    common:
      - output_types: conda
        packages:
          - cugraph-dgl==25.2.*,>=0.0.0a0

Maybe here, after depends_on_cugraph:

https://github.com/alexbarghi-nv/cugraph-gnn/blob/9b19ee4b5407706dffc86ce971673133b33c63a4/dependencies.yaml#L543

It doesn't have to have as much stuff as depends_on_cugraph (for example), because we're only using it to reference conda packages.

Alternatively, you could avoid this depends_on_cugraph_dgl stuff (since this is the only reference) and instead add an item to the test_notebook: group (https://github.com/alexbarghi-nv/cugraph-gnn/blob/9b19ee4b5407706dffc86ce971673133b33c63a4/dependencies.yaml#L365C1-L372C16), so it'd look like this:

  test_notebook:
    common:
      - output_types: [conda, requirements]
        packages:
          - ipython
          - nbconvert
          - notebook>=0.5.0
          - ogb
      - output_types: [conda]
        packages:
          - cugraph-dgl==25.2.*,>=0.0.0a0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed now. Thanks for the great explanation @jameslamb !

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no prob, happy to help 😊

Co-authored-by: James Lamb <[email protected]>
Copy link
Member

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing the notebook jobs pass! (build link)

The only failing just is cugraph-dgl wheels testing, but that looks like a network error that'd be resolved by just re-running the job.

pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='download.pytorch.org', port=443): Read timed out.

(build link)

I just restarted it. This should be good to merge once that passes.

Thanks for getting this fixed so quickly @alexbarghi-nv !

@alexbarghi-nv
Copy link
Member Author

/merge

@rapids-bot rapids-bot bot merged commit a9ab8b4 into rapidsai:branch-25.02 Jan 16, 2025
82 checks passed
@alexbarghi-nv alexbarghi-nv deleted the hetero-pyg branch January 16, 2025 18:04
@jakirkham
Copy link
Member

Thanks all! 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Introduces a breaking change feature request New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support Heterogeneous Sample Input in cuGraph-PyG

4 participants