Cleanup naive bayes#7623

Merged
rapids-bot[bot] merged 7 commits into rapidsai:main from jcrist:cleanup-naive-bayes
Dec 18, 2025

Conversation

@jcrist
Member

@jcrist jcrist commented Dec 17, 2025

This applies the cleanups part of #7317 to cuml.naive_bayes. It's a followup to #7424.

Highlights:

  • Several follow-up fixes from #7424 (FIX Remove custom CUDA kernels from naive bayes estimators), mostly ripping out leftover debugging code that is no longer needed.
  • Removal of validation, conversion, and initialization of fitted attributes from __init__.
  • Fixed CumlArrayDescriptor definitions so that each model only defines the fitted attributes it supports, rather than defining the same fitted attributes across all naive bayes estimators.
  • Removal of cuml.prims.array, since this module is no longer used.
  • Simplification of code paths, removal of extraneous definitions
  • Docstring cleanups
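
To illustrate the `__init__` item above, here is a minimal sketch of the scikit-learn-style convention it refers to: `__init__` only stores hyperparameters, while validation, conversion, and creation of fitted attributes happen in `fit`. `TinyMultinomialNB` is a hypothetical NumPy-only toy written for this example. It is not the code in `cuml.naive_bayes` and does not use `CumlArrayDescriptor`.

```python
import numpy as np


class TinyMultinomialNB:
    """Hypothetical toy estimator for illustration; not cuml's implementation."""

    def __init__(self, alpha=1.0):
        # Only store the hyperparameter as given: no validation, no
        # conversion, and no fitted attributes are created here.
        self.alpha = alpha

    def fit(self, X, y):
        # Validation happens at fit time rather than in __init__.
        if self.alpha < 0:
            raise ValueError("alpha must be non-negative")
        X = np.asarray(X, dtype=np.float64)
        y = np.asarray(y)

        # Fitted attributes (trailing underscore) first appear here.
        self.classes_, y_idx = np.unique(y, return_inverse=True)
        n_classes = self.classes_.shape[0]

        # Sum the feature counts of all rows belonging to each class.
        self.feature_count_ = np.zeros((n_classes, X.shape[1]))
        np.add.at(self.feature_count_, y_idx, X)
        self.class_count_ = np.bincount(y_idx, minlength=n_classes).astype(
            np.float64
        )

        # Additive smoothing, then normalize per class in log space.
        smoothed = self.feature_count_ + self.alpha
        self.feature_log_prob_ = np.log(smoothed) - np.log(
            smoothed.sum(axis=1, keepdims=True)
        )
        self.class_log_prior_ = np.log(self.class_count_) - np.log(
            self.class_count_.sum()
        )
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=np.float64)
        # Joint log-likelihood of each class for each row.
        scores = X @ self.feature_log_prob_.T + self.class_log_prior_
        return self.classes_[np.argmax(scores, axis=1)]
```

With this layout, constructing `TinyMultinomialNB(alpha=-1)` succeeds and the error only surfaces when `fit` is called, and an unfitted instance has no `classes_` attribute at all.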

Given this is a very low priority module, I didn't dwell too much on the individual implementations, instead only handling the end goals in #7317. There are still improvements that could be made to this module. In particular:

  • There are visible inefficiencies in the cupy code and category handling: work is repeated and extraneous copies are made. Given the priorities here, I don't think this is worth working on unless someone suddenly needs these models to be much faster.
  • Unlike the rest of cuml, these classifiers still don't handle non-numeric inputs. Punting on this for now, but we likely do want to support this in the future (if for nothing else, I'd like to rip out cuml.prims.labels completely). I'll open a followup issue.

@jcrist jcrist self-assigned this Dec 17, 2025
@jcrist jcrist requested a review from a team as a code owner December 17, 2025 21:04
@jcrist jcrist requested a review from viclafargue December 17, 2025 21:04
@jcrist jcrist added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Dec 17, 2025
@github-actions github-actions bot added the Cython / Python (Cython or Python issue) label Dec 17, 2025
@jcrist jcrist requested review from betatim and removed request for viclafargue December 17, 2025 21:04
Contributor

@viclafargue viclafargue left a comment

LGTM, just two questions.

@jcrist
Member Author

jcrist commented Dec 18, 2025

/merge

@rapids-bot rapids-bot bot merged commit 7bb51db into rapidsai:main Dec 18, 2025
103 of 105 checks passed
mani-builds pushed a commit to mani-builds/cuml that referenced this pull request Jan 11, 2026

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Victor Lafargue (https://github.com/viclafargue)

URL: rapidsai#7623