
Conversation

@mdboom (Contributor) commented Dec 17, 2025

(Marked as draft as a reminder to not merge until after the 0.5.0 release...)

Prerequisites to get this PR to pass:

This is the first landing of cuda.core.system, with all of the features in the nvutil prototype (which covers a somewhat arbitrary collection of the most essential NVML features, but is a reasonable starting point for a first PR).

This requires a generator change (not yet merged) to include the AUTO_LOWPP_* classes in the .pxd file so they can be cimport'ed. I know we don't usually do that, but it seems important to be able to reuse those high-level bindings rather than repeat ourselves. ABI stability there should be fine -- I don't anticipate needing to change anything on the .pxd side of those classes.

Following the nvutil design, this initializes NVML immediately upon import of cuda.core.system. That feels convenient and may be the right choice, but it will be hard to walk back. Questions the NVML docs don't answer for me: are there any use cases where you would want to init/shut down NVML repeatedly? (The cuda.bindings.nvml tests do this, so I know it works.) Is there any harm in init'ing and never shutting down? We could add an atexit handler, but I don't know whether one is required.
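
For concreteness, here's a minimal sketch of the kind of idempotent init plus atexit shutdown being considered. It is not what this PR does verbatim, and it uses pynvml purely for illustration since the PR's actual entry points live in cuda.bindings:

```python
# Minimal sketch only -- idempotent NVML init with an atexit shutdown.
# Uses pynvml for illustration; the PR itself goes through cuda.bindings.
import atexit
import threading

from pynvml import nvmlInit, nvmlShutdown

_lock = threading.Lock()
_initialized = False


def ensure_nvml() -> None:
    """Initialize NVML once per process; later calls are no-ops."""
    global _initialized
    with _lock:
        if not _initialized:
            nvmlInit()
            # Only matters if an explicit shutdown turns out to be required.
            atexit.register(nvmlShutdown)
            _initialized = True
```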

copy-pr-bot (bot) commented Dec 17, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


```cython
from cuda.bindings cimport _nvml as nvml


def get_driver_version() -> tuple[int, int]:
```
@mdboom (Contributor Author) commented on this snippet:

This is a bit confusing: this is an existing API that returns the CUDA version. It really should be called get_cuda_version to avoid confusion, but that would be a breaking change. There is a new API below, get_gpu_driver_version, that returns the actual driver version, but that naming isn't great either.
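
For reference, NVML itself distinguishes the two concepts. A small pynvml-based illustration of the difference (not this PR's API, and the version values shown are only examples):

```python
# Illustration via pynvml (not cuda.core.system): NVML distinguishes the CUDA
# version supported by the installed driver from the display driver's own version.
from pynvml import (
    nvmlInit,
    nvmlShutdown,
    nvmlSystemGetCudaDriverVersion,  # e.g. 13000 -> CUDA 13.0
    nvmlSystemGetDriverVersion,      # e.g. "580.65.06"
)

nvmlInit()
try:
    cuda_ver = nvmlSystemGetCudaDriverVersion()
    print("CUDA version:", (cuda_ver // 1000, (cuda_ver % 1000) // 10))
    print("GPU driver version:", nvmlSystemGetDriverVersion())
finally:
    nvmlShutdown()
```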

@mdboom (Contributor Author) commented Dec 17, 2025

/ok to test

A reviewer (Member) commented:

FYI, cuda.core supports any cuda-bindings/cuda-python 12.x and 13.x, many of which do not have the NVML bindings available. So we need a version guard here before importing anything that expects the bindings to exist, and to raise an exception in such cases.
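
Something along these lines, perhaps. The threshold, the error type, and the distribution-name lookup below are illustrative assumptions, not prescriptions:

```python
# Illustrative sketch of the suggested version guard; the minimum version and the
# error type are placeholders, not this PR's actual values.
from importlib.metadata import version

_MIN_BINDINGS_FOR_NVML = (13, 1)  # placeholder threshold


def _load_nvml_bindings():
    """Import the NVML bindings only if the installed cuda-bindings is new enough."""
    # Note: a real implementation should handle pre-release suffixes gracefully.
    installed = tuple(int(p) for p in version("cuda-bindings").split(".")[:2])
    if installed < _MIN_BINDINGS_FOR_NVML:
        raise RuntimeError(
            "cuda.core.system requires cuda-bindings >= "
            f"{'.'.join(map(str, _MIN_BINDINGS_FOR_NVML))}; found {version('cuda-bindings')}"
        )
    from cuda.bindings import nvml  # imported only once we know it exists
    return nvml
```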

@mdboom (Contributor Author) replied:

Ah, good reminder. I guess that precludes cimport'ing anything from cuda.bindings._nvml, since _nvml is a moving target. I'll just take that out for now...

@mdboom (Contributor Author) commented Dec 17, 2025

/ok to test

5 similar comments from @mdboom followed on Dec 17, 2025, each reading: /ok to test

Copilot AI (Contributor) left a comment:

Pull request overview

This PR introduces the new cuda.core.system module that provides system-level GPU information via NVML (NVIDIA Management Library). It replaces the previous singleton System class with a more comprehensive module that offers both backward-compatible functions and new NVML-powered device management capabilities.

Key changes:

  • Replaces singleton System class with module-level functions (get_num_devices(), get_driver_version(), etc.)
  • Adds comprehensive Device class with NVML-backed properties for device information (architecture, memory, PCI info, etc.)
  • Implements automatic NVML initialization on module import with version-gated availability
  • Provides utility functions for formatting byte counts and unpacking bitmasks (an illustrative sketch follows this list)
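
A rough sketch of what helpers like these typically look like; the function bodies below are illustrative assumptions, not the PR's actual utils.pyx code (the real unpack helper may operate on arrays of mask words rather than a single integer):

```python
# Hedged sketch of byte-formatting and bitmask-unpacking utilities.
def format_bytes(n: int) -> str:
    """Render a byte count with a binary-prefix unit, e.g. 1536 -> '1.5 KiB'."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(n)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{int(value)} B" if unit == "B" else f"{value:.1f} {unit}"
        value /= 1024


def unpack_bitmask(mask: int) -> list[int]:
    """Return the indices of set bits, e.g. 0b1010 -> [1, 3]."""
    indices = []
    bit = 0
    while mask:
        if mask & 1:
            indices.append(bit)
        mask >>= 1
        bit += 1
    return indices
```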

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| cuda_core/tests/test_memory.py | Updates API calls from the ccx_system.num_devices property to the ccx_system.get_num_devices() function |
| cuda_core/tests/system/test_system_utils.py | Adds comprehensive tests for utility functions (format_bytes, unpack_bitmask) |
| cuda_core/tests/system/test_system_system.py | Adds tests for system-level functions (driver versions, device count, process name) |
| cuda_core/tests/system/test_system_device.py | Adds extensive tests for Device class properties (architecture, memory, PCI info, etc.) |
| cuda_core/tests/system/test_nvml_context.py | Adds tests for NVML initialization state management across processes |
| cuda_core/tests/system/conftest.py | Defines NVML version requirements and a skip marker for unsupported versions |
| cuda_core/tests/system/__init__.py | Empty __init__ file for the test module |
| cuda_core/cuda/core/experimental/system/utils.pyx | Implements utility functions for byte formatting and bitmask unpacking |
| cuda_core/cuda/core/experimental/system/system.pyx | Implements system-level query functions with NVML and fallback support |
| cuda_core/cuda/core/experimental/system/device.pyx | Implements Device class with comprehensive GPU properties via NVML |
| cuda_core/cuda/core/experimental/system/_nvml_context.pyx | Implements thread-safe, per-process NVML initialization logic |
| cuda_core/cuda/core/experimental/system/__init__.py | Module entry point with version-gated NVML imports and initialization |
| cuda_core/cuda/core/experimental/_system.py | Removes the deprecated singleton System class |
| cuda_core/cuda/core/experimental/__init__.py | Updates imports to use the new system module instead of the System singleton |
| cuda_bindings/cuda/bindings/_nvml.pyx | Adds enums and fixes BAR1Memory property naming (breaking change) |


@leofang added the P0 (High priority - Must do!), feature (New feature or request), and cuda.core (Everything related to the cuda.core module) labels on Dec 18, 2025
@leofang added this to the cuda.core backlog milestone on Dec 18, 2025
@leofang added the triage (Needs the team's attention) label on Dec 18, 2025
@mdboom marked this pull request as ready for review on December 18, 2025, 13:22
copy-pr-bot (bot) commented Dec 18, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.


@mdboom (Contributor Author) commented Dec 18, 2025

/ok to test
