✨ feat: Support parallel computation for NDT by Tao-Sheng · Pull Request #6376 · PointCloudLibrary/pcl

Tao-Sheng · 2025-12-10T05:56:47Z

Support parallel computation for NDT by OpenMP with different neighbor search method

registration/include/pcl/registration/impl/ndt.hpp

Use signed std::ptrdiff_t instead of std::size_t

Copilot

Pull request overview

This PR adds OpenMP-based parallel computation support to the Normal Distributions Transform (NDT) registration algorithm, along with multiple neighbor search methods to optimize performance. The changes introduce thread-level parallelization to speed up the derivative computation phase of NDT.

Key changes:

Adds a NeighborSearchMethod enum with four options: KDTREE (original), DIRECT26, DIRECT7, and DIRECT1
Implements setNumberOfThreads() method with OpenMP support for parallel execution
Parallelizes the main computation loop in computeDerivatives() using OpenMP directives
Updates test to verify functionality with the new API

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
registration/include/pcl/registration/ndt.h	Adds enum for neighbor search methods, declares thread control API and private member variables for threading configuration
registration/include/pcl/registration/impl/ndt.hpp	Implements setNumberOfThreads() with OpenMP detection, parallelizes derivative computation loop, adds switch statement for different search methods
test/registration/test_ndt.cpp	Updates test to explicitly configure search method and thread count for validation

Comments suppressed due to low confidence (1)

registration/include/pcl/registration/impl/ndt.hpp:269

The OpenMP parallel region has a critical thread safety issue. The variables score, score_gradient, and hessian are marked as shared, but multiple threads will update them concurrently (via score += on line 266 and calls to updateDerivatives which modifies score_gradient and hessian). These concurrent updates to shared variables will cause race conditions, leading to incorrect results. You need to use reduction clauses for score and critical sections or atomic operations for score_gradient and hessian, or alternatively use thread-local accumulators that are combined after the parallel region.

#pragma omp parallel for num_threads(threads_) schedule(dynamic, 32)                   \
    shared(score, score_gradient, hessian) firstprivate(neighborhood, distances)
  // Update gradient and hessian for each point, line 17 in Algorithm 2 [Magnusson 2009]
  for (std::ptrdiff_t idx = 0; idx < static_cast<std::ptrdiff_t>(input_->size());
       idx++) {
    // Transformed Point
    const auto& x_trans_pt = trans_cloud[idx];

    // Find neighbors with different search method
    switch (search_method_) {
    case KDTREE:
      // Radius search has been experimentally faster than direct neighbor checking.
      target_cells_.radiusSearch(x_trans_pt, resolution_, neighborhood, distances);
      break;
    case DIRECT26:
      target_cells_.getNeighborhoodAtPoint(x_trans_pt, neighborhood);
      break;
    case DIRECT7:
      target_cells_.getFaceNeighborsAtPoint(x_trans_pt, neighborhood);
      break;
    case DIRECT1:
      target_cells_.getVoxelAtPoint(x_trans_pt, neighborhood);
      break;
    }

    // Original Point
    const Eigen::Vector3d x = (*input_)[idx].getVector3fMap().template cast<double>();
    const Eigen::Vector3d x_trans_position =
        x_trans_pt.getVector3fMap().template cast<double>();

    for (const auto& cell : neighborhood) {
      // Denorm point, x_k' in Equations 6.12 and 6.13 [Magnusson 2009]
      const Eigen::Vector3d x_trans = x_trans_position - cell->getMean();
      // Inverse Covariance of Occupied Voxel
      // Uses precomputed covariance for speed.
      const Eigen::Matrix3d c_inv = cell->getInverseCov();

      // Compute derivative of transform function w.r.t. transform vector, J_E and H_E
      // in Equations 6.18 and 6.20 [Magnusson 2009]
      computePointDerivatives(x);
      // Update score, gradient and hessian, lines 19-21 in Algorithm 2, according to
      // Equations 6.10, 6.12 and 6.13, respectively [Magnusson 2009]
      score +=
          updateDerivatives(score_gradient, hessian, x_trans, c_inv, compute_hessian);
    }
  }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

registration/include/pcl/registration/ndt.h

registration/include/pcl/registration/impl/ndt.hpp

Apply copilot suggestion to fix some error

mvieth

You have to at least put a omp critical region around computePointDerivatives(x) and updateDerivatives(...) because they both use global variables.
Also maybe put NeighborSearchMethod into the NormalDistributionTransform class, to make it clear that they belong together?

- Added the switch to select different search methods to computeHessian() as well (would not make sense to use different search methods in computeDerivatives() and computeHessian()) - Added overloads to functions computePointDerivatives(), updateDerivatives(), and updateHessian() to be able to use local variables instead of the global variables point_jacobian_ and point_hessian_ - Change the parallelized region in computeDerivatives() by using the new function overloads, and by introducing a custom reduction (goal: compatibility with OpenMP 2.0). Synchronization overhead between threads is minimal, with only two critical regions at the end to accumulate, score, gradient, and hessian - Tested with two larger point clouds, and there is indeed a nice speed-up when increasing the number of threads (however not a perfect 1-to-1 speedup because there are still other functions in NDT that are not parallelized)

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

test/registration/test_ndt.cpp

registration/include/pcl/registration/impl/ndt.hpp

registration/include/pcl/registration/ndt.h

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

registration/include/pcl/registration/ndt.h

registration/include/pcl/registration/impl/ndt.hpp

test/registration/test_ndt.cpp

registration/include/pcl/registration/impl/ndt.hpp

registration/include/pcl/registration/ndt.h

mvieth · 2026-03-15T09:52:27Z

I looked a bit more into this, and I realized that my earlier suggestion to put an omp critical region around computePointDerivatives(x) and updateDerivatives(...) would hinder parallelization a lot because a lot of computation time is spent in those two function. So instead I added function overloads to those two and updateHessian(...) to be able to use local variables (private per-thread) instead of the global class members point_jacobian_ and point_hessian_. I also created a reduction on score, score_gradient, and hessian (with a workaround for the last two to make this compatible with OpenMP 2.0), so now there is only minimal synchronization overhead at the end when the results from all threads are added together. I have tested with two larger point clouds (from the tutorials), and there is indeed a nice speed-up when increasing the number of threads (however not a perfect 1-to-1 speed-up because there are still other functions in NDT that are not parallelized). In a future PR, we could check which other functions of NDT could be parallelized sensibly.

I also added the switch to select different search methods to computeHessian() as well (would not make sense to use different search methods in computeDerivatives() and computeHessian()).

✨ feat: Support parallel computation for NDT

a1ab585

Support parallel computation for NDT by OpenMP with different neighbor search method

larshg reviewed Dec 10, 2025

View reviewed changes

registration/include/pcl/registration/impl/ndt.hpp Outdated Show resolved Hide resolved

🐞 fix: OpenMP Requires Signed Indices

bb01879

Use signed std::ptrdiff_t instead of std::size_t

larshg added module: registration changelog: enhancement Meta-information for changelog generation labels Dec 11, 2025

larshg added this to the pcl-1.16.0 milestone Dec 11, 2025

larshg requested review from Copilot and mvieth December 11, 2025 06:56

Copilot started reviewing on behalf of larshg December 11, 2025 06:57 View session

Copilot AI reviewed Dec 11, 2025

View reviewed changes

registration/include/pcl/registration/ndt.h Outdated Show resolved Hide resolved

registration/include/pcl/registration/ndt.h Outdated Show resolved Hide resolved

registration/include/pcl/registration/impl/ndt.hpp Outdated Show resolved Hide resolved

🐞 fix: Apply copilot suggestion

6c0acc8

Apply copilot suggestion to fix some error

mvieth reviewed Dec 22, 2025

View reviewed changes

mvieth added the needs: more work Specify why not closed/merged yet label Feb 8, 2026

mvieth requested a review from Copilot March 14, 2026 10:46

Copilot started reviewing on behalf of mvieth March 14, 2026 10:46 View session

mvieth force-pushed the feat/ndt_openmp branch from 50069ad to ed06792 Compare March 14, 2026 10:47

Copilot AI reviewed Mar 14, 2026

View reviewed changes

Fixes per copilot

c566c4d

mvieth requested a review from Copilot March 14, 2026 11:06

Copilot started reviewing on behalf of mvieth March 14, 2026 11:06 View session

Fix formatting

37d8d3e

Copilot AI reviewed Mar 14, 2026

View reviewed changes

Also add DIRECT27

5c88705

mvieth removed the needs: more work Specify why not closed/merged yet label Mar 15, 2026

mvieth approved these changes Mar 15, 2026

View reviewed changes

mvieth requested a review from larshg March 15, 2026 09:52

larshg approved these changes Mar 21, 2026

View reviewed changes

mvieth merged commit 8273409 into PointCloudLibrary:master Mar 21, 2026
13 checks passed

Uh oh!

Conversation

Tao-Sheng commented Dec 10, 2025

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mvieth left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mvieth commented Mar 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants