Prepare cuml for removal of deprecated raft apis #7561

Merged
rapids-bot[bot] merged 17 commits into rapidsai:main from aamijar:raft-deprecated-apis
Dec 15, 2025
Member

@aamijar aamijar commented Dec 3, 2025

Resolves #7554. Depends on rapidsai/cuvs#1610 (CI won't pass until that PR is merged).

What does this PR do?

  1. Removes lingering unused raft headers that will be deprecated such as #include <raft/spatial/knn/knn.cuh>, #include <raft/distance/distance.cuh>, etc.
  2. Updates to raft::memory_type_from_pointer instead of the deprecated raft::spatial::knn::detail::utils::pointer_residency.
  3. Removes metric_processor from knn.hpp and knn.cu.
    The only special metric processing needed is for correlation distance, which we can handle in knn.cu instead of using the class from raft's processing.cuh. Cosine distance is supported by ivf_flat and ivf_pq in cuvs, so we no longer need the inner product metric and the special processing that was there before.
  4. Uses build_dendrogram_host from cuvs instead of raft.
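
The identity behind item 3 can be sketched on the host as follows. This is a minimal illustration, not the code from knn.cu: the helper names (center_and_normalize, inner_product, correlation_distance) are hypothetical, plain std::vector stands in for device arrays, and it assumes the usual definition of correlation distance as 1 minus the Pearson correlation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from cuml or raft): subtract the row mean, then
// scale the row to unit L2 norm. Assumes a non-empty, non-constant row.
std::vector<float> center_and_normalize(const std::vector<float>& row)
{
  assert(!row.empty());
  float mean = 0.0f;
  for (float v : row) mean += v;
  mean /= static_cast<float>(row.size());

  std::vector<float> out(row.size());
  float sq_norm = 0.0f;
  for (std::size_t i = 0; i < row.size(); ++i) {
    out[i] = row[i] - mean;
    sq_norm += out[i] * out[i];
  }
  assert(sq_norm > 0.0f);  // a constant row has no defined correlation
  const float inv_norm = 1.0f / std::sqrt(sq_norm);
  for (float& v : out) v *= inv_norm;
  return out;
}

// Plain dot product of two equally sized rows.
float inner_product(const std::vector<float>& a, const std::vector<float>& b)
{
  assert(a.size() == b.size());
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) acc += a[i] * b[i];
  return acc;
}

// After centering and normalizing, the inner product equals the Pearson
// correlation, so the distance is 1 minus that inner product.
float correlation_distance(const std::vector<float>& a, const std::vector<float>& b)
{
  return 1.0f - inner_product(center_and_normalize(a), center_and_normalize(b));
}
```

Under these assumptions, the rows {1, 2, 3} and {2, 4, 6} give a distance close to 0, and {1, 2, 3} and {3, 2, 1} give a distance close to 2, which is why an inner-product index over pre-processed rows can serve correlation queries.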

copy-pr-bot bot commented Dec 3, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@aamijar added the non-breaking and improvement labels Dec 3, 2025
@aamijar marked this pull request as ready for review December 3, 2025 10:11
@aamijar requested review from a team as code owners December 3, 2025 10:11
Contributor

@viclafargue viclafargue left a comment

Removes metric_processor from knn.hpp and knn.cu which is not needed since cuvs handles it internally.

Does that mean that the pre/post processing code in there was not interfering with what cuVS is doing and can safely be removed?

If so, shouldn't this be removed too?

#include <thrust/execution_policy.h>
#include <thrust/transform.h>

#include <cuvs/cluster/agglomerative.hpp> // build_dendrogram_host

Contributor

This is already included just below

Member Author

Addressed in 5899a33


index->metric_processor->revert(query_array);

// perform post-processing to show the real distances

Contributor

What about that post-processing? Is it also needed?

Comment on lines +114 to +118
auto indices_mem_type = raft::memory_type_from_pointer(knn_graph.knn_indices);
auto dists_mem_type = raft::memory_type_from_pointer(knn_graph.knn_dists);

return !raft::is_device_accessible(indices_mem_type) ||
!raft::is_device_accessible(dists_mem_type);

Contributor

We are tracking an issue in CI that makes the CUDA context invalid. It looks like it may be an illegal memory access that happens when UMAP is given a pre-computed KNN in host memory while the HMM feature is enabled (making host pointers device accessible). We have disabled pre-computed KNN on host for now, but ideally we would enable it and disable it only in the specific case where HMM is enabled (we need to investigate the HMM case separately).

Member Author

Yes, this one has a merge conflict anyway, since this bit of code was reverted, so I will remove the changes here.
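
For readers following this thread, the check under review reduces to a predicate over the memory types of the two pointers. Below is a minimal host-only sketch. MemoryType, is_device_accessible, and knn_graph_on_host are hypothetical stand-ins, not the raft types, and the sketch assumes plain host memory is never device accessible, which is exactly the assumption the HMM case described above would break.

```cpp
#include <cassert>

// Hypothetical stand-in for the memory type of a pointer.
enum class MemoryType { Host, Pinned, Device, Managed };

// Assumption for this sketch: everything except plain host memory can be
// dereferenced from the device.
inline bool is_device_accessible(MemoryType t) { return t != MemoryType::Host; }

// Hypothetical summary of a precomputed KNN graph: where each array lives.
struct KnnGraphInfo {
  MemoryType indices;
  MemoryType dists;
};

// True if either array cannot be read from the device, mirroring the shape of
// the condition in the snippet under review.
inline bool knn_graph_on_host(const KnnGraphInfo& g)
{
  return !is_device_accessible(g.indices) || !is_device_accessible(g.dists);
}
```

If a system reports host pointers as device accessible, a predicate of this shape returns false for host data, so the caller would take the device path for memory it may not be able to use safely.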

@github-actions github-actions bot added the CMake label Dec 14, 2025
FORK rapidsai
PINNED_TAG ${rapids-cmake-checkout-tag}
FORK aamijar
PINNED_TAG raft-deprecated-apis

Member Author

Revert later

Member Author

aamijar commented Dec 14, 2025

I've updated the PR so that the pre- and post-processing code should be functionally the same. The goal is to remove the dependency on processing.cuh from raft, which was implicitly included via #include <raft/spatial/knn/ann.cuh>. So I've taken the relevant bits of code from processing.cuh and put them in knn.cu directly.

Contributor

@viclafargue viclafargue left a comment

Thanks @aamijar! Just two questions on the KNN implementation.

// ANN index
if (metric == ML::distance::DistanceType::CosineExpanded ||
metric == ML::distance::DistanceType::CorrelationExpanded) {
metric = index->metric = ML::distance::DistanceType::InnerProduct;

Contributor

Do you know why we used to set index->metric?

Member Author

@aamijar aamijar Dec 15, 2025

index->metric is still used as usual at the beginning of the function, where it is set with index->metric = metric and then forwarded to cuvs. The reason we previously set it to inner product in the if statement is that special processing was required, since cuvs ivf-flat and ivf-pq didn't support cosine or correlation. To do the equivalent computation, we used inner product plus pre/post processing.

Contributor

@viclafargue viclafargue Dec 15, 2025

But, if I understood correctly CorrelationExpanded is not supported. We implement pre/post processing here, but should pass InnerProduct metric to cuVS, right? It looks like in this case metric = InnerProduct, but index->metric = CorrelationExpanded. Why don't we do metric = index->metric = InnerProduct?

Member Author

@aamijar aamijar Dec 15, 2025

The metric variable is what is actually passed to cuvs; index->metric just records locally what the original metric was.

Contributor

Yes, metric is what is sent to cuVS anyway, so it should work. Unlike before, index->metric is kept as CorrelationExpanded, but I guess this is just an implementation detail and is done on purpose.

Member Author

Yes, that's what I understood
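
The outcome of this thread can be summarized in a small host-only sketch. DistanceType, AnnIndex, and select_backend_metric here are hypothetical simplifications, not the cuml definitions. The sketch only models the split described above: the index records the requested metric, while a separate local value is what gets forwarded to the backend.

```cpp
#include <cassert>

// Hypothetical, reduced set of metrics for illustration only.
enum class DistanceType { L2Expanded, InnerProduct, CosineExpanded, CorrelationExpanded };

// Hypothetical index record: keeps the metric the caller asked for.
struct AnnIndex {
  DistanceType metric;
};

// Records the requested metric on the index and returns the metric to forward
// to the backend. Only correlation is rewritten to inner product; cosine is
// passed through on the assumption that the backend supports it natively.
DistanceType select_backend_metric(AnnIndex& index, DistanceType requested)
{
  index.metric = requested;
  if (requested == DistanceType::CorrelationExpanded) { return DistanceType::InnerProduct; }
  return requested;
}
```

Keeping the original metric on the index lets later code decide whether queries need the same pre-processing as the indexed data, without consulting the backend.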

auto stream = raft::resource::get_cuda_stream(handle);

// For correlation: preprocess (center + normalize), use InnerProduct, then revert
if (metric == ML::distance::DistanceType::CorrelationExpanded) {

Contributor

The CorrelationExpanded case seems to be handled correctly. Most metrics would use the DefaultMetricProcessor that does not do anything. But, unless I am missing something it looks like the CorrelationExpanded is not handled atm.

Member Author

Did you mean CosineExpanded is not handled? CosineExpanded does have support in cuvs ivf-flat and ivf-pq now so we don't need to do special processing and override the metric to be inner product.

Contributor

@viclafargue viclafargue Dec 15, 2025

Yes, I meant CosineExpanded. Makes sense. What about post-processing for the Lp/L2 metrics? Is it not handled in cuVS? It looks like we are sending the metricArg to cuVS.

Member Author

Yes, L2 is handled in cuvs, but Lp is not. I've updated the code to only do processing on Lp.
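
As an illustration of what Lp post-processing can look like, here is a minimal host-only sketch. It rests on an assumption not confirmed in this thread: that the backend returns the un-rooted sum of |x_j - y_j|^p, so the caller applies the final 1/p power. The function name lp_postprocess is hypothetical, and std::vector stands in for the device distance array.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Assumption: each entry of dists holds sum_j |x_j - y_j|^p with no final root.
// Applies the 1/p power in place to obtain the Minkowski (Lp) distance.
void lp_postprocess(std::vector<float>& dists, float p)
{
  assert(p > 0.0f);
  const float inv_p = 1.0f / p;
  for (float& d : dists) {
    d = std::pow(std::fabs(d), inv_p);
  }
}
```

With p = 3, raw values of 8 and 27 become approximately 2 and 3. With p = 2 the same step is a square root, which per the reply above is already handled inside cuvs for L2.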

@github-actions github-actions bot removed the CMake label Dec 15, 2025
Contributor

@viclafargue viclafargue left a comment

LGTM!

Member Author

aamijar commented Dec 15, 2025

/merge

@rapids-bot rapids-bot bot merged commit cdc006c into rapidsai:main Dec 15, 2025
106 checks passed
rapids-bot bot pushed a commit to rapidsai/raft that referenced this pull request Dec 17, 2025
…ghbors/` apis (#2885)

Supersedes #2813 and #2878
Resolves #2737 and Resolves #2872

Marking as **breaking** to get more eyes on this and mitigate risk.
This PR should not break downstream libraries as long as we merge the updates to them first: rapidsai/cuvs#1610, rapidsai/cuml#7561. 
I've found a usage of breaking api in FAISS here:
https://github.com/facebookresearch/faiss/blob/1721ebff6de6ed5a8481302123479be9d85059a2/faiss/gpu/GpuDistance.cu#L46.
https://github.com/facebookresearch/faiss/blob/5b19fca3f057b837ac898af52a8eb801c4744892/faiss/gpu/impl/CuvsFlatIndex.cu#L34

What does this PR do?
1. Removes `cluster/`, `distance/`, `neighbors/` (except `detail/faiss_select/`), `sparse/neighbors/`, `spatial/`
2. Removes unused includes that will be deprecated such as `#include <raft/distance/distance.cuh>`, `#include <raft/spatial/knn/knn.cuh>`, etc.
3. Removes the legacy lanczos solver (`linalg/lanczos`, `sparse/linalg/lanczos`, and the old functions in `sparse/solver/lanczos`) and removes the legacy spectral apis (`spectral/`, except modularity_maximization and partition, which are metrics used by cugraph)
4. Removes corresponding gtests, raft_runtime, and bench files.

Authors:
  - Anupam (https://github.com/aamijar)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2885
mani-builds pushed a commit to mani-builds/cuml that referenced this pull request Jan 11, 2026
Resolves rapidsai#7554, Depends on rapidsai/cuvs#1610 (CI won't pass until this is merged)

What does this PR do?

1. Removes lingering **unused** raft headers that will be deprecated such as `#include <raft/spatial/knn/knn.cuh>`, `#include <raft/distance/distance.cuh>`, etc.
2. ~~Updates to raft::memory_type_from_pointer instead of the deprecated raft::spatial::knn::detail::utils::pointer_residency.~~
3. Removes `metric_processor` from `knn.hpp` and `knn.cu`. 
The only special metric processing needed is for correlation distance which we can handle in `knn.cu` instead of using the class from `processing.cuh` from raft. The cosine distance is supported in ivf_flat and ivf_pq in cuvs so we do **not** need to use the innerproduct metric and special processing that was there before.
4. Uses `build_dendrogram_host` from cuvs instead of raft.

Authors:
  - Anupam (https://github.com/aamijar)

Approvers:
  - Victor Lafargue (https://github.com/viclafargue)

URL: rapidsai#7561
Labels: CUDA/C++, improvement, non-breaking

Linked issue: [TRACKER] Prepare cuml for removal of deprecated raft apis