Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
51 commits
Select commit Hold shift + click to select a range
f5ea1b6
Added EMI and its test
Dec 17, 2025
01b56a2
Added code for Adjusted MI
Dec 17, 2025
1a7feaa
Completed adj_MI
Dec 29, 2025
110147b
Working _adjusted_mi on cupy
Dec 31, 2025
9ab6d93
__init__ file in metrics
Jan 2, 2026
ad9a3c5
Added tests
Jan 2, 2026
01e8c64
Remove alpha specs from non-RAPIDS dependencies (#7568)
bdice Dec 8, 2025
111abe8
Remove unnecessary `syncthreads` in UMAP optimize (#7575)
jinsolp Dec 9, 2025
c82d9ae
Revert precomputed kNN on CPU for UMAP (#7576)
jinsolp Dec 9, 2025
94dda13
Use `sklearn.datasets.fetch_covtype` in example notebook (#7581)
jcrist Dec 9, 2025
4298c33
Don't link to cumlprims_mg when MG is turned off (#7583)
robertmaynard Dec 9, 2025
14d7adc
Enable merge barriers (#7579)
KyleFromNVIDIA Dec 9, 2025
818e954
Update to numba-cuda `>=0.22.1,<0.23.0` (#7549)
brandon-b-miller Dec 10, 2025
d41e29a
Pin scikit-learn<1.8 in test dependencies (#7588)
csadorf Dec 10, 2025
df05a91
Update Changelog [skip ci]
AyodeAwe Dec 10, 2025
9bc4bda
Simplify type reflection implementation (#7539)
jcrist Dec 10, 2025
f1d2723
UMAP fix int32 overflow causing illegal mem access (#7587)
aamijar Dec 11, 2025
e09e133
Add devcontainer fallback for C++ test location (#7596)
bdice Dec 12, 2025
c20dd0d
Explicitly initialize uninitialized SvmParameter fields in SVC constr…
dantegd Dec 12, 2025
f5e08c8
Add lower bound to dask-ml>=2024 (#7600)
bdice Dec 12, 2025
10bd34e
Forward `cuml.accel` logging to subprocesses (#7602)
jcrist Dec 12, 2025
a6085f8
Improved condense hierarchy of HDBSCAN (#7459)
jinsolp Dec 12, 2025
59864b4
Isolating `test_umap_outliers` from precomputed kNN issues (#7594)
jinsolp Dec 12, 2025
4c5b91c
Fix scikit-learn 1.8 test incompatibilities (#7589)
csadorf Dec 15, 2025
3e69866
Prepare cuml for removal of deprecated raft apis (#7561)
aamijar Dec 15, 2025
9e220a5
Require scipy>=1.13.0. (#7605)
csadorf Dec 15, 2025
d6da27b
Deprecate `TotalIters` for SVC/SVR (#7612)
jcrist Dec 15, 2025
f178333
Remove deprecations for 26.02 (#7613)
jcrist Dec 15, 2025
62f1f83
Remove deprecated `normalize` parameter (#7611)
jcrist Dec 15, 2025
b0b550e
Update cuML to C++20 (#7619)
divyegala Dec 17, 2025
7eb5242
Avoid deprecated conversion of numpy array->scalar (#7618)
jcrist Dec 17, 2025
14e0726
Updating UMAP random_state doc (#7590)
viclafargue Dec 17, 2025
b473dad
Add warning in UMAP for nondeterministic behavior when `build_algo=nn…
jinsolp Dec 17, 2025
0ba0908
`k-means++` documented as a init option (#7615)
aamijar Dec 17, 2025
5278f08
FIX Remove custom CUDA kernels from naive bayes estimators (#7424)
betatim Dec 17, 2025
b4e01b2
Bypass data allocation to GPU given precomputed knn in UMAP (#7566)
jinsolp Dec 17, 2025
203c03c
Attempt to find `cmake-format-rapids-cmake.json` in `cpp/latest` (#7626)
divyegala Dec 18, 2025
052802a
Optional seed spectral embedding (#7608)
aamijar Dec 18, 2025
e506545
Cleanup naive bayes (#7623)
jcrist Dec 18, 2025
431c5a6
Don't fail on `cuda.core.experimental` deprecation warnings (#7629)
jcrist Dec 19, 2025
b744460
Empty commit to trigger a build (#7630)
bdice Dec 19, 2025
7a61d6a
Move cumlprims_mg into cuML (#7585)
dantegd Dec 22, 2025
af9dce8
Deprecate handle from public APIs (#7628)
jcrist Dec 22, 2025
abc5068
CI Make nightly cuml.accel integration test stricter (#7631)
betatim Dec 23, 2025
892d5d3
Use a generated dataset in place of the actual Covertype dataset. (#7…
csadorf Dec 31, 2025
a41b858
Remove reliance on remote datasets in tests (#7637)
csadorf Dec 31, 2025
686df69
Added comments
Jan 11, 2026
b09c1c1
Merge upstream changes
Jan 11, 2026
225aa3d
Merged with upstream
Jan 13, 2026
5915df5
Modified scipy to cupyx.scipy
Feb 25, 2026
b42b949
Fixed 'Segmentation Fault' issues with ami and emi
Feb 26, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
360 changes: 360 additions & 0 deletions build.sh.save
Original file line number Diff line number Diff line change
@@ -0,0 +1,360 @@
#!/bin/bash

# SPDX-FileCopyrightText: Copyright (c) 2019-2025, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0

# cuml build script

# This script is used to build the component(s) in this repo from
# source, and can be called with various options to customize the
# build as needed (see the help output for details)

# Abort script on first error
set -e

NUMARGS=$# ARGS=$*

# NOTE: ensure all dir changes are relative to the location of this
# script, and that this script resides in the repo dir!
REPODIR=$(cd "$(dirname "$0")"; pwd)

VALIDTARGETS="clean libcuml cuml cpp-mgtests prims bench prims-bench
cppdocs pydocs" VALIDFLAGS="-v -g -n --allgpuarch --singlegpu
--nolibcumltest --nvtx --show_depr_warn --codecov --ccache
--configure-only --build-metrics --incl-cache-stats -h --help "
VALIDARGS="${VALIDTARGETS} ${VALIDFLAGS}" HELP="$0 [<target> ...]
[<flag> ...]
where <target> is: clean - remove all existing build artifacts and
configuration (start over) libcuml - build the cuml C++ code only.
Also builds the C-wrapper library
around the C++ code. cuml - build the cuml
Python package cpp-mgtests - build libcuml mnmg tests. Builds MPI
communicator, adding MPI as dependency. prims - build the ml-prims
tests bench - build the libcuml C++ benchmark prims-bench - build
the ml-prims C++ benchmark cppdocs - build the C++ API doxygen
documentation pydocs - build the general and Python API
documentation
and <flag> is: -v - verbose build mode -g - build for debug -n - no
install step -h - print this text --allgpuarch - build for all
supported GPU architectures --singlegpu - Build libcuml and cuml
without multigpu components --nolibcumltest - disable building
libcuml C++ tests for a faster build --nvtx - Enable nvtx for
profiling support --show_depr_warn - show cmake deprecation
warnings --codecov - Enable code coverage support by compiling with
Cython linetracing
and profiling enabled (WARNING: Impacts
performance) --ccache - Use ccache to cache previous compilations
--configure-only - Invoke CMake without actually building
--static-treelite - Force CMake to use the Treelite static libs,
cloning and building them if necessary --build-metrics - filename
for generating build metrics report for libcuml --incl-cache-stats
- include cache statistics in build metrics report

default action (no args) is to build and install 'libcuml', 'cuml',
and 'prims' targets only for the detected GPU arch

The following environment variables are also accepted to allow
further customization: PARALLEL_LEVEL - Number of parallel threads
to use in compilation. CUML_EXTRA_CMAKE_ARGS - Extra arguments to
pass directly to cmake. Values listed in environment
variable will override existing arguments.
Example:
CUML_EXTRA_CMAKE_ARGS=\"-DBUILD_CUML_C_LIBRARY=OFF\"
./build.sh
CUML_EXTRA_PYTHON_ARGS - Extra arguments to pass directly to pip
install " LIBCUML_BUILD_DIR=${LIBCUML_BUILD_DIR:=${REPODIR}/cpp/build}
CUML_BUILD_DIR=${REPODIR}/python/cuml/build
PYTHON_DEPS_CLONE=${REPODIR}/python/external_repositories
BUILD_DIRS="${LIBCUML_BUILD_DIR} ${CUML_BUILD_DIR}
${PYTHON_DEPS_CLONE}"

# Set defaults for vars modified by flags to this script
BUILD_TYPE=Release INSTALL_TARGET=install BUILD_ALL_GPU_ARCH=0
SINGLEGPU_CPP_FLAG="" NVTX=OFF CCACHE=OFF CLEAN=0
BUILD_DISABLE_DEPRECATION_WARNINGS=ON BUILD_CUML_TESTS=ON
BUILD_CUML_MG_TESTS=OFF BUILD_STATIC_TREELITE=OFF
CMAKE_LOG_LEVEL=WARNING BUILD_REPORT_METRICS=OFF
BUILD_REPORT_INCL_CACHE_STATS=OFF

# Set defaults for vars that may not have been defined externally
INSTALL_PREFIX=${INSTALL_PREFIX:=${PREFIX:=${CONDA_PREFIX:=$LIBCUML_BUILD_DIR/install}}}
PARALLEL_LEVEL=${PARALLEL_LEVEL:=$(nproc)}

# Default to Ninja if generator is not specified
export CMAKE_GENERATOR="${CMAKE_GENERATOR:=Ninja}"

# Allow setting arbitrary cmake args via the $CUML_ADDL_CMAKE_ARGS variable. Any
# values listed here will override existing arguments. For example:
# CUML_EXTRA_CMAKE_ARGS="-DBUILD_CUML_C_LIBRARY=OFF" ./build.sh
# Will disable building the C library even though it is hard coded to ON
CUML_EXTRA_CMAKE_ARGS=${CUML_EXTRA_CMAKE_ARGS:=""}

read -ra CUML_EXTRA_CMAKE_ARGS <<< "$CUML_EXTRA_CMAKE_ARGS"

CUML_EXTRA_PYTHON_ARGS=${CUML_EXTRA_PYTHON_ARGS:=""}

read -ra CUML_EXTRA_PYTHON_ARGS <<< "$CUML_EXTRA_PYTHON_ARGS"

function hasArg {
(( NUMARGS != 0 )) && (echo " ${ARGS} " | grep -q " $1 ")
}

# Check for valid usage
if (( NUMARGS != 0 )); then
for a in ${ARGS}; do
if ! (echo " ${VALIDARGS} " | grep -q " ${a} "); then
echo "Invalid option or formatting, check --help: ${a}"
exit 1
fi
done
fi

function completeBuild {
(( NUMARGS == 0 )) && return
for a in ${ARGS}; do
if (echo " ${VALIDTARGETS} " | grep -q " ${a} "); then
false; return
fi
done
true
}

if hasArg -h || hasArg --help; then
echo "${HELP}"
exit 0
fi

if hasArg clean; then
CLEAN=1
fi

if hasArg cpp-mgtests; then
BUILD_CUML_MG_TESTS=ON
fi

# Long arguments
LONG_ARGUMENT_LIST=(
"verbose"
"debug"
"no-install"
"allgpuarch"
"singlegpu"
"nvtx"
"show_depr_warn"
"codecov"
"ccache"
"nolibcumltest"
"configure-only"
"build-metrics"
"incl-cache-stats"
)

# Short arguments
ARGUMENT_LIST=(
"v"
"g"
"n"
)

# read arguments
opts=$(getopt \
--longoptions "$(printf "%s," "${LONG_ARGUMENT_LIST[@]}")" \
--name "$(basename "$0")" \
--options "$(printf "%s" "${ARGUMENT_LIST[@]}")" \
-- "$@"
)

# shellcheck disable=SC2181
if [ $? != 0 ] ; then echo "Terminating..." >&2 ; exit 1 ; fi

eval set -- "$opts"

while true; do
case "$1" in
-h)
show_help
exit 0
;;
-v | --verbose )
VERBOSE_FLAG="-v"
CMAKE_LOG_LEVEL=VERBOSE
;;
-g | --debug )
BUILD_TYPE=RelWithDebInfo
;;
-n | --no-install )
INSTALL_TARGET=""
;;
--allgpuarch )
BUILD_ALL_GPU_ARCH=1
;;
--singlegpu )
CUML_EXTRA_PYTHON_ARGS+=("--singlegpu")
SINGLEGPU_CPP_FLAG=ON
;;
--nvtx )
NVTX=ON
;;
--show_depr_warn )
BUILD_DISABLE_DEPRECATION_WARNINGS=OFF
;;
--codecov )
CUML_EXTRA_PYTHON_ARGS+=("--linetrace=1" "--profile")
;;
--ccache )
CCACHE=ON
;;
--nolibcumltest )
BUILD_CUML_TESTS=OFF
;;
--static-treelite )
BUILD_STATIC_TREELITE=ON
;;
--build-metrics )
BUILD_REPORT_METRICS=ON
;;
--incl-cache-stats )
BUILD_REPORT_INCL_CACHE_STATS=ON
;;
--)
shift
break
;;
esac
shift
done

# If clean given, run it prior to any other steps
if (( CLEAN == 1 )); then
# If the dirs to clean are mounted dirs in a container, the
# contents should be removed but the mounted dirs will remain.
# The find removes all contents but leaves the dirs, the rmdir
# attempts to remove the dirs but can fail safely.
for bd in ${BUILD_DIRS}; do
if [ -d "${bd}" ]; then
find "${bd}" -mindepth 1 -delete
rmdir "${bd}" || true
fi
done

# Clean up python artifacts
find "${REPODIR}"/python/ | grep -E "(__pycache__|\.pyc|\.pyo|\.so|\_skbuild)$" | xargs rm -rf

# Remove Doxyfile
rm -rf "${REPODIR}"/cpp/Doxyfile

# Remove .benchmark dirs and .pytest_cache
find "${REPODIR}"/ | grep -E "(\.pytest_cache|\.benchmarks)$" | xargs rm -rf
fi


################################################################################
# Configure for building all C++ targets
if completeBuild || hasArg libcuml || hasArg prims || hasArg bench || hasArg prims-bench || hasArg cppdocs || hasArg cpp-mgtests; then
if (( BUILD_ALL_GPU_ARCH == 0 )); then
CUML_CMAKE_CUDA_ARCHITECTURES="NATIVE"
echo "Building for the architecture of the GPU in the system..."
else
CUML_CMAKE_CUDA_ARCHITECTURES="RAPIDS"
echo "Building for *ALL* supported GPU architectures..."
fi

mkdir -p "${LIBCUML_BUILD_DIR}"
cd "${LIBCUML_BUILD_DIR}"

cmake -DCMAKE_INSTALL_PREFIX="${INSTALL_PREFIX}" \
-DCMAKE_CUDA_ARCHITECTURES=${CUML_CMAKE_CUDA_ARCHITECTURES} \
-DCMAKE_BUILD_TYPE=${BUILD_TYPE} \
-DBUILD_CUML_C_LIBRARY=ON \
-DSINGLEGPU=${SINGLEGPU_CPP_FLAG} \
-DCUML_ALGORITHMS="ALL" \
-DBUILD_CUML_TESTS=${BUILD_CUML_TESTS} \
-DBUILD_CUML_MPI_COMMS=${BUILD_CUML_MG_TESTS} \
-DBUILD_CUML_MG_TESTS=${BUILD_CUML_MG_TESTS} \
-DCUML_USE_TREELITE_STATIC=${BUILD_STATIC_TREELITE} \
-DNVTX=${NVTX} \
-DUSE_CCACHE=${CCACHE} \
-DDISABLE_DEPRECATION_WARNINGS=${BUILD_DISABLE_DEPRECATION_WARNINGS} \
-DCMAKE_PREFIX_PATH="${INSTALL_PREFIX}" \
-DCMAKE_MESSAGE_LOG_LEVEL=${CMAKE_LOG_LEVEL} \
"${CUML_EXTRA_CMAKE_ARGS[@]}" \
..
fi

# If `./build.sh cuml` is called, don't build C/C++ components
if (! hasArg --configure-only) && (completeBuild || hasArg libcuml || hasArg prims || hasArg bench || hasArg cpp-mgtests); then
# get the current count before the compile starts
CACHE_TOOL=${CACHE_TOOL:-sccache}
if [[ "$BUILD_REPORT_INCL_CACHE_STATS" == "ON" && -x "$(command -v "${CACHE_TOOL}")" ]]; then
"${CACHE_TOOL}" --zero-stats
fi

cd "${LIBCUML_BUILD_DIR}"
if [ -n "${INSTALL_TARGET}" ]; then
cmake --build "${LIBCUML_BUILD_DIR}" -j"${PARALLEL_LEVEL}" --target ${INSTALL_TARGET} "${VERBOSE_FLAG}"
else
cmake --build "${LIBCUML_BUILD_DIR}" -j"${PARALLEL_LEVEL}" "${VERBOSE_FLAG}"
fi

if [[ "$BUILD_REPORT_METRICS" == "ON" && -f "${LIBCUML_BUILD_DIR}/.ninja_log" ]]; then
if ! rapids-build-metrics-reporter.py 2> /dev/null && [ ! -f rapids-build-metrics-reporter.py ]; then
echo "Downloading rapids-build-metrics-reporter.py"
curl -sO https://raw.githubusercontent.com/rapidsai/build-metrics-reporter/v1/rapids-build-metrics-reporter.py
fi

echo "Formatting build metrics"
MSG=""
# get some sccache/ccache stats after the compile
if [[ "$BUILD_REPORT_INCL_CACHE_STATS" == "ON" ]]; then
if [[ ${CACHE_TOOL} == "sccache" && -x "$(command -v sccache)" ]]; then
COMPILE_REQUESTS=$(sccache -s | grep "Compile requests \+ [0-9]\+$" | awk '{ print $NF }')
CACHE_HITS=$(sccache -s | grep "Cache hits \+ [0-9]\+$" | awk '{ print $NF }')
HIT_RATE=$(COMPILE_REQUESTS="${COMPILE_REQUESTS}" CACHE_HITS="${CACHE_HITS}" python3 -c "import os; print(f'{int(os.getenv(\"CACHE_HITS\")) / int(os.getenv(\"COMPILE_REQUESTS\")):.2f}' if int(os.getenv(\"COMPILE_REQUESTS\")) else 'nan')")
MSG="${MSG}<br/>cache hit rate ${HIT_RATE} %"
elif [[ ${CACHE_TOOL} == "ccache" && -x "$(command -v ccache)" ]]; then
CACHE_STATS_LINE=$(ccache -s | grep "Hits: \+ [0-9]\+ / [0-9]\+" | tail -n1)
if [[ -n "$CACHE_STATS_LINE" ]]; then
CACHE_HITS=$(echo "$CACHE_STATS_LINE" - | awk '{ print $2 }')
COMPILE_REQUESTS=$(echo "$CACHE_STATS_LINE" - | awk '{ print $4 }')
HIT_RATE=$(COMPILE_REQUESTS="${COMPILE_REQUESTS}" CACHE_HITS="${CACHE_HITS}" python3 -c "import os; print(f'{int(os.getenv(\"CACHE_HITS\")) / int(os.getenv(\"COMPILE_REQUESTS\")):.2f}' if int(os.getenv(\"COMPILE_REQUESTS\")) else 'nan')")
MSG="${MSG}<br/>cache hit rate ${HIT_RATE} %"
fi
fi
fi
MSG="${MSG}<br/>parallel setting: $PARALLEL_LEVEL"
if [[ -f "${LIBCUML_BUILD_DIR}/libcuml++.so" ]]; then
LIBCUML_FS=$(find "${LIBCUML_BUILD_DIR}" -name libcuml++.so -printf '%s'| awk '{printf "%.2f MB", $1/1024/1024}')
MSG="${MSG}<br/>libcuml++.so size: $LIBCUML_FS"
fi
BMR_DIR=${RAPIDS_ARTIFACTS_DIR:-"${LIBCUML_BUILD_DIR}"}
echo "The HTML report can be found at [${BMR_DIR}/ninja_log.html]. In CI, this report"
echo "will also be uploaded to the appropriate subdirectory of https://downloads.rapids.ai/ci/cuml/, and"
echo "the entire URL can be found in \"conda-cpp-build\" runs under the task \"Upload additional artifacts\""
echo "Metrics output dir: [$BMR_DIR]"
mkdir -p "${BMR_DIR}"
MSG_OUTFILE="$(mktemp)"
echo "$MSG" > "${MSG_OUTFILE}"
PATH=".:$PATH" python rapids-build-metrics-reporter.py "${LIBCUML_BUILD_DIR}"/.ninja_log --fmt html --msg "${MSG_OUTFILE}" > "${BMR_DIR}"/ninja_log.html
cp "${LIBCUML_BUILD_DIR}"/.ninja_log "${BMR_DIR}"/ninja.log
fi
fi

if (! hasArg --configure-only) && hasArg cppdocs; then
cd "${LIBCUML_BUILD_DIR}"
cmake --build "${LIBCUML_BUILD_DIR}" --target docs_cuml
fi


# Build and (optionally) install the cuml Python package
if (! hasArg --configure-only) && (completeBuild || hasArg cuml || hasArg pydocs); then
# Replace spaces with semicolons in SKBUILD_EXTRA_CMAKE_ARGS
SKBUILD_EXTRA_CMAKE_ARGS=${SKBUILD_EXTRA_CMAKE_ARGS// /;}

SKBUILD_CMAKE_ARGS="-DCMAKE_MESSAGE_LOG_LEVEL=${CMAKE_LOG_LEVEL};${SKBUILD_EXTRA_CMAKE_ARGS}" \
python -m pip install --no-build-isolation --no-deps --config-settings rapidsai.disable-cuda=true "${REPODIR}"/python/cuml

if hasArg pydocs; then
cd "${REPODIR}"/docs
make html
fi
fi
2 changes: 1 addition & 1 deletion ci/test_notebooks.sh
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ rapids-logger "notebook tests cuml"

# Add notebooks that should be skipped here
# (space-separated list of filenames without paths)
SKIPNBS="cuml_benchmarks.ipynb hdbscan_soft_clustering_benchmark.ipynb"
SKIPNBS="cuml_benchmarks.ipynb hdbscan_soft_clustering_benchmark.ipynb forest_inference_demo.ipynb target_encoder_walkthrough.ipynb"
NBTEST="$(realpath "$(dirname "$0")/utils/nbtest.sh")"
REPO_ROOT="$(realpath "$(dirname "$0")/..")"
CI_NOTEBOOKS_DIR="${REPO_ROOT}/ci/notebooks"
Expand Down
10 changes: 9 additions & 1 deletion conda/environments/all_cuda-129_arch-aarch64.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -64,19 +64,27 @@ dependencies:
- rapids-build-backend>=0.4.0,<0.5.0
- rapids-dask-dependency==26.2.*,>=0.0.0a0
- rapids-logger==0.2.*,>=0.0.0a0
- rapids-xgboost==26.2.*,>=0.0.0a0
- recommonmark
- rich
- rmm==26.2.*,>=0.0.0a0
<<<<<<< HEAD
- scikit-build-core>=0.10.0
- scikit-learn>=1.4
=======
- scikit-build-core>=0.11.0
- scikit-learn>=1.5
>>>>>>> upstream/main
- scipy>=1.13.0
- seaborn
- sphinx
- sphinx-copybutton
- sphinx-markdown-tables
- statsmodels
- sysroot_linux-aarch64==2.28
<<<<<<< HEAD
- tenacity
=======
>>>>>>> upstream/main
- treelite>=4.6.1,<5.0.0
- umap-learn==0.5.7
name: all_cuda-129_arch-aarch64
Loading