Commit 80b0688
huaweil-nv authored and 1tnguyen committed
Fix segfault in splitBatchedState for distributed multi-GPU batched evolution (NVIDIA#3771)
This commit fixes a critical bug in the distributed batched state handling for cudaq.evolve() with store_intermediate_results=ALL on multi-GPU systems.

Root causes fixed:
1. distributeBatchedStateData: Incorrect batch index calculation caused out-of-bounds memory access when distributing state data across GPUs.
2. splitBatchedState: Used local dimension with global batch size, causing incorrect state size calculation and wrong number of states per GPU.
3. cudm_solver.py: Assumed splitBatchedState returns all batch_size states, but in distributed mode it correctly returns only local subset.

Changes:
- Add singleStateDimension field to CuDensityMatState to track individual state dimension within a batch
- Fix batch index calculation using cuDensityMat API's batchModeLocation
- Update splitBatchedState to use singleStateDimension for correct sizing
- Update Python solver to handle distributed partial results correctly
- Add comprehensive MPI tests for distributed batched evolution scenarios

Signed-off-by: huaweil <huaweil@nvidia.com>
Co-authored-by: Thien Nguyen <58006629+1tnguyen@users.noreply.github.com>
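
The sizing mistake described in root cause 2 can be illustrated with a small standalone sketch. The helper names below (numLocalStates, stateSizeFromGlobalBatch) are hypothetical and are not taken from the CUDA-Q sources. The sketch also assumes that each GPU holds a whole number of states, which this commit message does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper, not part of the CUDA-Q code base.
// localDimension:       number of elements stored on this GPU; in distributed
//                       mode this can be smaller than
//                       batchSize * singleStateDimension.
// singleStateDimension: number of elements in one state of the batch.
// Assumption: every GPU holds a whole number of states.
inline std::size_t numLocalStates(std::size_t localDimension,
                                  std::size_t singleStateDimension) {
  assert(singleStateDimension != 0 && "singleStateDimension must be set");
  if (localDimension % singleStateDimension != 0)
    throw std::invalid_argument("local buffer does not hold whole states");
  return localDimension / singleStateDimension;
}

// The faulty pattern named in root cause 2, reproduced for comparison only:
// dividing the local dimension by the global batch size gives a per-state
// size that is too small whenever the batch is spread over several GPUs.
inline std::size_t stateSizeFromGlobalBatch(std::size_t localDimension,
                                            std::size_t globalBatchSize) {
  return localDimension / globalBatchSize;
}
```

As a worked case, take a batch of 4 states with 4 elements each, split evenly across 2 GPUs, so each GPU stores 8 elements. numLocalStates(8, 4) gives 2 states per GPU, whereas stateSizeFromGlobalBatch(8, 4) gives a per-state size of 2 instead of 4.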
1 parent 40ed2d8 commit 80b0688

1 file changed, +1 -3 lines changed


runtime/nvqir/cudensitymat/CuDensityMatState.h

Lines changed: 1 addition & 3 deletions
@@ -31,14 +31,12 @@ class CuDensityMatState : public cudaq::SimulationState {
   // For batched states in distributed mode, dimension < batchSize *
   // singleStateDimension.
   std::size_t singleStateDimension = 0;
-  bool borrowedData = false;
 
 public:
   // Create a state with a size and data pointer.
   // Note: the underlying cudm state is not yet initialized as we don't know the
   // dimensions of sub-systems.
-  // If `borrowed` is true, the state does not own the device data pointer.
-  CuDensityMatState(std::size_t s, void *ptr, bool borrowed = false);
+  CuDensityMatState(std::size_t s, void *ptr);
 
   // Default constructor
   CuDensityMatState() {}
