[FEA] Support Heterogeneous Sampling in cuGraph-PyG #82
loader = NeighborLoader(
    (feature_store, graph_store),
    num_neighbors=[0, 1, 0, 1],
Is the 0 fanouts here used to test edge cases?
It's used to test whether we can properly exclude an edge type.
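To make the point of the zero fanouts concrete, here is a minimal pure-Python sketch of one-hop sampling over a toy heterogeneous graph, where a fanout of 0 skips an edge type entirely. It is not the cuGraph-PyG sampler: the `ADJACENCY` data, the `sample_one_hop` helper, and the per-edge-type fanout dictionary are all hypothetical and exist only for illustration.

```python
import random

# Toy heterogeneous graph: edge type -> {source node: [neighbor nodes]}.
# Every name below is hypothetical and exists only for this sketch.
ADJACENCY = {
    ("paper", "cites", "paper"): {0: [1, 2], 1: [2]},
    ("author", "writes", "paper"): {0: [0, 1], 1: [2]},
}


def sample_one_hop(adjacency, seeds, fanouts, seed=0):
    """Sample up to `fanout` neighbors per seed node for each edge type."""
    rng = random.Random(seed)
    sampled = {}
    for edge_type, fanout in fanouts.items():
        if fanout == 0:
            continue  # a zero fanout excludes this edge type entirely
        src_type = edge_type[0]
        edges = []
        for node in seeds.get(src_type, []):
            neighbors = adjacency[edge_type].get(node, [])
            k = min(fanout, len(neighbors))
            edges.extend((node, nbr) for nbr in rng.sample(neighbors, k))
        sampled[edge_type] = edges
    return sampled


result = sample_one_hop(
    ADJACENCY,
    seeds={"paper": [0, 1], "author": [0, 1]},
    fanouts={
        ("paper", "cites", "paper"): 0,    # excluded
        ("author", "writes", "paper"): 1,  # one neighbor per seed
    },
)
```

With these fanouts, `result` has no entry for the `cites` edge type, which is the behaviour the test above is checking for in the real loader.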
jameslamb
left a comment
Put up some suggestions on the notebook testing.
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" \
--prepend-channel "${CPP_CHANNEL}" \
--prepend-channel "${PYTHON_CHANNEL}" \
| tee env.yaml
At this point in this script, CPP_CHANNEL and PYTHON_CHANNEL haven't yet been set. If you want the downloaded CI artifacts to be considered in the conda solve, you'll have to move this block from lower down up above this:
rapids-logger "Downloading artifacts from previous jobs"
CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)
If you do that, then it'd also be good to remove the rapids-mamba-retry install cugraph-dgl in favor of that coming through this, so there will be a single call to create the environment.
Here's an example: rapidsai/ucx-py#1101
dependencies.yaml
Outdated
- depends_on_cudf
- depends_on_cugraph
Instead of this, I think here you want a depends_on_cugraph_dgl, so that a requirement like this gets added to the env.yaml:
- cugraph-dgl==25.2.*,>=0.0.0a0
Then cudf and cugraph will come through automatically as part of cugraph-dgl's required dependencies.
- cugraph ={{ minor_version }}
That's a better pattern for CI, because it allows us to catch packaging problems of the form "cugraph-dgl depends on cudf but doesn't explicitly declare it" or something like that.
cuspatial uses this "consolidated solves" approach, you could follow that project's example:
includes:
  - cuda_version
  - depends_on_pytorch
  - depends_on_cugraph_dgl
This won't work as-is because this project's dependencies.yaml doesn't yet have a depends_on_cugraph_dgl.
Add this:
depends_on_cugraph_dgl:
  common:
    - output_types: conda
      packages:
        - cugraph-dgl==25.2.*,>=0.0.0a0
Maybe here, after depends_on_cugraph:
It doesn't have to have as much stuff as depends_on_cugraph (for example), because we're only using it to reference conda packages.
Alternatively, you could avoid this depends_on_cugraph_dgl stuff (since this is the only reference) and instead add an item to the test_notebook: group (https://github.com/alexbarghi-nv/cugraph-gnn/blob/9b19ee4b5407706dffc86ce971673133b33c63a4/dependencies.yaml#L365C1-L372C16), so it'd look like this:
test_notebook:
  common:
    - output_types: [conda, requirements]
      packages:
        - ipython
        - nbconvert
        - notebook>=0.5.0
        - ogb
    - output_types: [conda]
      packages:
        - cugraph-dgl==25.2.*,>=0.0.0a0
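The problem flagged in this comment (an includes list naming a group that dependencies.yaml never defines) can be caught with a small check like the sketch below. It assumes the file has already been parsed into a dictionary with top-level `files` and `dependencies` keys. The `undefined_includes` helper and the `test_notebooks` file key are hypothetical, and this is not how rapids-dependency-file-generator itself validates the file.

```python
def undefined_includes(config):
    """Map each file key to the included groups that are not defined."""
    defined = set(config.get("dependencies", {}))
    missing = {}
    for file_key, file_cfg in config.get("files", {}).items():
        absent = [
            name for name in file_cfg.get("includes", [])
            if name not in defined
        ]
        if absent:
            missing[file_key] = absent
    return missing


# Hypothetical, already-parsed stand-in for dependencies.yaml.
config = {
    "files": {
        "test_notebooks": {
            "includes": [
                "cuda_version",
                "depends_on_pytorch",
                "depends_on_cugraph_dgl",
            ],
        },
    },
    "dependencies": {
        "cuda_version": {},
        "depends_on_pytorch": {},
    },
}

missing = undefined_includes(config)
print(missing)
```

Once a `depends_on_cugraph_dgl` entry is added under `dependencies`, the same call returns an empty dictionary.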
Should be fixed now. Thanks for the great explanation @jameslamb !
no prob, happy to help 😊
Co-authored-by: James Lamb <[email protected]>
jameslamb
left a comment
Seeing the notebook jobs pass! (build link)
The only failing job is cugraph-dgl wheels testing, but that looks like a network error that'd be resolved by just re-running the job.
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='download.pytorch.org', port=443): Read timed out.
I just restarted it. This should be good to merge once that passes.
Thanks for getting this fixed so quickly @alexbarghi-nv !
/merge
Thanks all! 🙏
Allows sampling of heterogeneous graphs.
Removes unbuffered sampling from the PyG examples and completely disables it in DGL. A future PR will completely drop PyG support for unbuffered sampling, and a future cugraph PR will drop support for unbuffered sampling in the distributed sampler.
Merge after rapidsai/cugraph#4795
Closes rapidsai/cugraph#4402