Prepare cuml for removal of deprecated raft apis#7561
Prepare cuml for removal of deprecated raft apis#7561rapids-bot[bot] merged 17 commits intorapidsai:mainfrom
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
viclafargue
left a comment
There was a problem hiding this comment.
Removes metric_processor from knn.hpp and knn.cu which is not needed since cuvs handles it internally.
Does that mean that the pre/post processing code in there was not interfering with what cuVS is doing and can safely be removed?
If so, shouldn't this be removed too?
cpp/tests/sg/hdbscan_test.cu
Outdated
| #include <thrust/execution_policy.h> | ||
| #include <thrust/transform.h> | ||
|
|
||
| #include <cuvs/cluster/agglomerative.hpp> // build_dendrogram_host |
There was a problem hiding this comment.
This is already included just below
|
|
||
| index->metric_processor->revert(query_array); | ||
|
|
||
| // perform post-processing to show the real distances |
There was a problem hiding this comment.
How about that post-processing, is it also needed ?
cpp/include/cuml/manifold/common.hpp
Outdated
| auto indices_mem_type = raft::memory_type_from_pointer(knn_graph.knn_indices); | ||
| auto dists_mem_type = raft::memory_type_from_pointer(knn_graph.knn_dists); | ||
|
|
||
| return !raft::is_device_accessible(indices_mem_type) || | ||
| !raft::is_device_accessible(dists_mem_type); |
There was a problem hiding this comment.
We are tracking an issue in CI that makes the CUDA context invalid. It looks like it may be an illegal memory accesses that happens when UMAP is given a pre-computed KNN on host memory while the HMM feature enabled (making host pointers device accessible). We disabled pre-computed KNN on host from now, but ideally we would want to enable it while disabling it in the specific case of HMM enabled (we need to investigate the HMM case separately).
There was a problem hiding this comment.
Yes, this one has a merge conflict anyway, since this bit of code was reverted. So will remove the changes here
cpp/cmake/thirdparty/get_raft.cmake
Outdated
| FORK rapidsai | ||
| PINNED_TAG ${rapids-cmake-checkout-tag} | ||
| FORK aamijar | ||
| PINNED_TAG raft-deprecated-apis |
|
I've updated the PR so that pre and post processing code should functionally be the same. The goal is to remove the dependency on |
viclafargue
left a comment
There was a problem hiding this comment.
Thanks @aamijar! Just two questions on the KNN implementation.
| // ANN index | ||
| if (metric == ML::distance::DistanceType::CosineExpanded || | ||
| metric == ML::distance::DistanceType::CorrelationExpanded) { | ||
| metric = index->metric = ML::distance::DistanceType::InnerProduct; |
There was a problem hiding this comment.
Do you know why we used to set index->metric?
There was a problem hiding this comment.
index->metric is still used as usual at the beginning of the function where it is set to
index->metric = metric then forwarded to cuvs. The reason we set it to inner product in the if statement before is because of the special processing that was required since cuvs ivf-flat and ivf-pq didn't support cosine or correlation. So to do the equivalent computation we used inner product + pre/post processing.
There was a problem hiding this comment.
But, if I understood correctly CorrelationExpanded is not supported. We implement pre/post processing here, but should pass InnerProduct metric to cuVS, right? It looks like in this case metric = InnerProduct, but index->metric = CorrelationExpanded. Why don't we do metric = index->metric = InnerProduct?
There was a problem hiding this comment.
There was a problem hiding this comment.
metric variable is what is actually passed to cuvs. index->metric is just recording locally what the original metric was.
There was a problem hiding this comment.
Yes, metric is what is sent to cuVS anyway so it should work, but index->metric is kept to CorrelationExpanded unlike before, but I guess this is just an implementation detail and is made on purpose.
There was a problem hiding this comment.
Yes, that's what I understood
| auto stream = raft::resource::get_cuda_stream(handle); | ||
|
|
||
| // For correlation: preprocess (center + normalize), use InnerProduct, then revert | ||
| if (metric == ML::distance::DistanceType::CorrelationExpanded) { |
There was a problem hiding this comment.
The CorrelationExpanded case seems to be handled correctly. Most metrics would use the DefaultMetricProcessor that does not do anything. But, unless I am missing something it looks like the CorrelationExpanded is not handled atm.
There was a problem hiding this comment.
Did you mean CosineExpanded is not handled? CosineExpanded does have support in cuvs ivf-flat and ivf-pq now so we don't need to do special processing and override the metric to be inner product.
There was a problem hiding this comment.
Yes, I meant CosineExpanded. Makes sense. What about Lp/L2 metrics postprocessing, is it not handled in cuVS? It looks like we are sending the metricArg to cuVS.
There was a problem hiding this comment.
Yes L2 is handled in cuvs, but Lp is not. I've updated the code to only do processing on Lp.
|
/merge |
…ghbors/` apis (#2885) Supersedes #2813 and #2878 Resolves #2737 and Resolves #2872 Marking as **breaking** to get more eyes on this and mitigate risk. This PR should not break downstream libraries as long as we merge the updates to them first: rapidsai/cuvs#1610, rapidsai/cuml#7561. I've found a usage of breaking api in FAISS here: https://github.com/facebookresearch/faiss/blob/1721ebff6de6ed5a8481302123479be9d85059a2/faiss/gpu/GpuDistance.cu#L46. https://github.com/facebookresearch/faiss/blob/5b19fca3f057b837ac898af52a8eb801c4744892/faiss/gpu/impl/CuvsFlatIndex.cu#L34 What does this PR do? 1. Removes `cluster/`, `distance/`, `neighbors/` (except `detail/faiss_select/`), `sparse/neighbors/`, `spatial/` 2. Removes unused includes that will be deprecated such as `#include <raft/distance/distance.cuh>`, `#include <raft/spatial/knn/knn.cuh>`, etc. 3. Removes legacy lanczos solver (`linalg/lanczos`, `sparse/linalg/lanczos` old functions in `sparse/solver/lanczos`) and removes legacy spectral apis (`spectral/ ` except modularity_maximization and partition which are metrics used by cugraph) 4. Removes corresponding gtests, raft_runtime, and bench files. Authors: - Anupam (https://github.com/aamijar) Approvers: - Bradley Dice (https://github.com/bdice) - Corey J. Nolet (https://github.com/cjnolet) URL: #2885
Resolves rapidsai#7554, Depends on rapidsai/cuvs#1610 (CI won't pass until this is merged) What does this PR do? 1. Removes lingering **unused** raft headers that will be deprecated such as `#include <raft/spatial/knn/knn.cuh>`, `#include <raft/distance/distance.cuh>`, etc. 2. ~~Updates to raft::memory_type_from_pointer instead of the deprecated raft::spatial::knn::detail::utils::pointer_residency.~~ 3. Removes `metric_processor` from `knn.hpp` and `knn.cu`. The only special metric processing needed is for correlation distance which we can handle in `knn.cu` instead of using the class from `processing.cuh` from raft. The cosine distance is supported in ivf_flat and ivf_pq in cuvs so we do **not** need to use the innerproduct metric and special processing that was there before. 4. Uses `build_dendrogram_host` from cuvs instead of raft. Authors: - Anupam (https://github.com/aamijar) Approvers: - Victor Lafargue (https://github.com/viclafargue) URL: rapidsai#7561
Resolves #7554, Depends on rapidsai/cuvs#1610 (CI won't pass until this is merged)
What does this PR do?
#include <raft/spatial/knn/knn.cuh>,#include <raft/distance/distance.cuh>, etc.Updates to raft::memory_type_from_pointer instead of the deprecated raft::spatial::knn::detail::utils::pointer_residency.metric_processorfromknn.hppandknn.cu.The only special metric processing needed is for correlation distance which we can handle in
knn.cuinstead of using the class fromprocessing.cuhfrom raft. The cosine distance is supported in ivf_flat and ivf_pq in cuvs so we do not need to use the innerproduct metric and special processing that was there before.build_dendrogram_hostfrom cuvs instead of raft.