Skip to content

Conversation

kaatish
Copy link
Contributor

@kaatish kaatish commented Sep 15, 2025

Description

This PR replaces bounds_presolve and multi_probe with their load balanced versions. Problems have a unique_ptr to their load balanced versions which are created when necessary.

Issue

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

Copy link

copy-pr-bot bot commented Sep 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@kaatish kaatish self-assigned this Sep 15, 2025
@kaatish kaatish added non-breaking Introduces a non-breaking change improvement Improves an existing functionality mip labels Sep 15, 2025
@kaatish kaatish added this to the 25.10 milestone Sep 15, 2025
@kaatish kaatish requested a review from akifcorduk September 15, 2025 12:47
@kaatish kaatish marked this pull request as ready for review September 15, 2025 13:56
@kaatish kaatish requested review from a team as code owners September 15, 2025 13:56
@kaatish
Copy link
Contributor Author

kaatish commented Sep 15, 2025

/ok to test 40b4dbe

@kaatish
Copy link
Contributor Author

kaatish commented Sep 15, 2025

/ok to test 2a84cc7

Copy link
Contributor

@akifcorduk akifcorduk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work. IMO, we should remove all non-lb functions in this PR not to cause further conflicts or confusions.

I didn't review LB functions or kernels as they were mostly restored from previous PR.

@@ -42,9 +42,15 @@ set(MIP_NON_LP_FILES
${CMAKE_CURRENT_SOURCE_DIR}/presolve/bounds_presolve.cu
${CMAKE_CURRENT_SOURCE_DIR}/presolve/bounds_update_data.cu
${CMAKE_CURRENT_SOURCE_DIR}/presolve/conditional_bound_strengthening.cu
${CMAKE_CURRENT_SOURCE_DIR}/presolve/lb_bounds_presolve.cu
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand the structure: we are using lb versions throughout the code and want to phase-out the non-lb version, right? If that's the case, I would put the non-lb versions under legacy dir to be removed completely.

${CMAKE_CURRENT_SOURCE_DIR}/presolve/multi_probe.cu
${CMAKE_CURRENT_SOURCE_DIR}/presolve/probing_cache.cu
${CMAKE_CURRENT_SOURCE_DIR}/presolve/trivial_presolve.cu
${CMAKE_CURRENT_SOURCE_DIR}/problem/load_balanced_problem.cu
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit confusing to have lb_problem.cu and load_balanced_problem.cu

f_t curr_cstr_violation_down = max(0., cnst_lb - eps - max_act[cstr_idx]);
auto slack = cnst_slack[cstr_idx];
f_t curr_cstr_violation_up = max(0., -get_lower(slack) - eps);
f_t curr_cstr_violation_down = max(0., get_upper(slack) - eps);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't max_act - cnst_lb == upper_slack ?

f_t curr_cstr_violation_up = max(0., min_act[cstr_idx] - (cnst_ub + eps));
f_t curr_cstr_violation_down = max(0., cnst_lb - eps - max_act[cstr_idx]);
auto slack = cnst_slack[cstr_idx];
f_t curr_cstr_violation_up = max(0., -get_lower(slack) - eps);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the slack computation go negative or is there a min(0, slack) check? If it doesn't go negative this will not work. If it can go negative, we need to check other places if there is an implicit assumption that the slack is non-negative.

// f_t new_violations_up = max(0., shift_in_activities - (cnst_ub - min_act) - eps);
// f_t new_violations_down = max(0., (cnst_lb - max_act) - shift_in_activities - eps);
// becomes
f_t new_violations_up = max(0., shift_in_activities - get_lower(slack) - eps);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: This is valid given that slack calculation allows negative.

<< num_sub_warp_blocks << "\n";
#endif

// block_id_offsets.push_back(bin_offsets[std::log2(16 * 2 * w_t_r) + 3]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover

thrust::fill(rmm::exec_policy(stream), vertex_id.begin(), vertex_id.end(), i_t{-1});
thrust::fill(rmm::exec_policy(stream), pseudo_block_id.begin(), pseudo_block_id.end(), i_t{1});
//{
// std::cerr<<"\nitem_block_segments\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover

<<<pb.n_constraints, n_threads, 0, handle_ptr->get_stream()>>>(
pb.view(), upd_0.view(), upd_1.view());
{
nvtxRangePush("multi_act");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for scopes IMO it is more maintainable and readable to use RAII based raft::common::nvtx::range

@@ -941,6 +944,7 @@ void problem_t<i_t, f_t>::resize_variables(size_t size)
objective_coefficients.resize(size, handle_ptr->get_stream());
is_binary_variable.resize(size, handle_ptr->get_stream());
related_variables_offsets.resize(size, handle_ptr->get_stream());
lb_problem_invalidated = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would only mark invalidated if the sizes are different. Theremight be places, we call resize on the same problem and it acts as a no-op.

i_t beg =
block_id_offsets[seg] + (med_block_id - block_offsets[seg]) * (blockDim.x / threads_per_row);
i_t end = block_id_offsets[seg + 1];
// if (threadIdx.x == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some leftovers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality mip non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants