Race batch PDLP and dual simplex in strong branching / reliability branching #994
Kh4ster wants to merge 52 commits into release/26.04 from race_strong_branching_realibility_branching
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings. Use the review commands or the checkboxes below for quick actions.
📝 Walkthrough

Adds cooperative Dual Simplex ⇄ Batch PDLP strong/reliability branching, new tri-state parameters/macros, shared strong-branching context types, PDLP warm-start / initial-iteration controls, coordinated concurrent termination and optional per-batch solution collection, plus API/signature updates across branching, PDLP, solver settings, tests, and build files.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)
104-109: ⚠️ Potential issue | 🔴 Critical: Don't derive early-stopped simplex bounds from `solution.x`.
For `ITERATION_LIMIT`, dual simplex is still useful here because it preserves a valid lower bound, but `compute_objective(child_problem, solution.x)` is not that bound. Feeding `c^T x` from a non-optimal iterate into strong-branch scores and pseudo-cost updates can bias branching decisions. Based on learnings, cuOpt dual simplex is dual-safe, so the safe lower bound at `ITERATION_LIMIT` is the solver's maintained objective value, not an objective recomputed from the current primal iterate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 104 - 109, The code sets obj = compute_objective(child_problem, solution.x) when status == dual::status_t::ITERATION_LIMIT, which incorrectly uses a non‑optimal primal iterate; instead fetch and use the solver's maintained objective/lower bound for the iterate (the dual‑safe objective value the solver preserves) rather than recomputing c^T x from solution.x. Locate the status checks (dual::status_t::ITERATION_LIMIT) and replace the compute_objective(child_problem, solution.x) use with the solver‑reported objective/lower bound (e.g., the solution's maintained objective field or a solver API that returns the current bound) in both the block around compute_objective and the other occurrence that mirrors lines 257-259 so pseudo‑cost updates and strong‑branch scoring use the dual‑safe bound.
🧹 Nitpick comments (4)
cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp (1)
21-22: Consider moving the shared strong-branching view to a solver-neutral header.
`pdlp_solver_settings_t` now pulls a `branch_and_bound/` header into the PDLP settings layer just to carry a coordination view. Relocating that type to a shared LP module would keep the dependency direction cleaner and make PDLP easier to reuse outside branch-and-bound. Based on learnings: "Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms"
Also applies to: 277-278
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp` around lines 21 - 22, The PDLP settings header currently includes branch_and_bound/shared_strong_branching_context.hpp to obtain the coordination view; instead move the shared strong-branching view type out of branch_and_bound into a solver-neutral header in the shared LP module (e.g., shared_lp/strong_branching_view.hpp) and update pdlp_solver_settings_t to include that new header rather than the branch_and_bound one; also update any other files that currently include shared_strong_branching_context.hpp (refs: pdlp_solver_settings_t and the other occurrences noted) to include the new solver-neutral header so PDLP no longer depends on branch_and_bound.
cpp/src/branch_and_bound/shared_strong_branching_context.hpp (1)
37-42: Consider relaxed memory ordering for a potential performance improvement.
The current default `memory_order_seq_cst` is correct but may have unnecessary overhead. Since the `solved` flag is only used as a hint (no other data depends on it), relaxed ordering would suffice:
♻️ Optional: Use relaxed memory ordering

```diff
 bool is_solved(i_t local_idx) const
 {
-  return solved[local_idx].load() != 0;
+  return solved[local_idx].load(std::memory_order_relaxed) != 0;
 }
-void mark_solved(i_t local_idx) const { solved[local_idx].store(1); }
+void mark_solved(i_t local_idx) const { solved[local_idx].store(1, std::memory_order_relaxed); }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp` around lines 37 - 42, The solved atomic currently uses default seq_cst ordering in is_solved and mark_solved; change them to use relaxed ordering since solved[] is only a hint and no other data depends on it: update is_solved to call solved[local_idx].load(std::memory_order_relaxed) and update mark_solved to call solved[local_idx].store(1, std::memory_order_relaxed) (references: is_solved, mark_solved, solved).
cpp/tests/linear_programming/pdlp_test.cu (2)
2310-2342: Same memory-leak risk with the raw pointer pattern.
This test has the same raw `new`/`delete` pattern as `shared_sb_view_concurrent_mark`. Apply the same `std::unique_ptr` fix for exception safety.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/linear_programming/pdlp_test.cu` around lines 2310 - 2342, Replace the raw new/delete pattern with a std::unique_ptr to ensure exception safety: change result_ptr from a raw pointer to std::unique_ptr<optimization_problem_solution_t<int,double>> result_ptr; inside the pdlp_thread lambda assign result_ptr = std::make_unique<optimization_problem_solution_t<int,double>>(solve_lp(&handle_, op_problem, solver_settings)); remove the explicit delete result_ptr; keep the existing dereference (auto& solution = *result_ptr) and joins unchanged; this preserves semantics while preventing leaks.
2244-2278: Consider using `std::unique_ptr` to prevent memory leaks on exceptions.
The raw `new`/`delete` pattern risks leaking memory if an exception is thrown between allocation and cleanup. Additionally, the 200 ms sleep-based synchronization may cause test flakiness under varying system loads.
♻️ Suggested improvement using unique_ptr

```diff
-  optimization_problem_solution_t<int, double>* result_ptr = nullptr;
+  std::unique_ptr<optimization_problem_solution_t<int, double>> result_ptr;
   auto pdlp_thread = std::thread([&]() {
-    auto sol = new optimization_problem_solution_t<int, double>(
-      solve_lp(&handle_, op_problem, solver_settings));
-    result_ptr = sol;
+    result_ptr = std::make_unique<optimization_problem_solution_t<int, double>>(
+      solve_lp(&handle_, op_problem, solver_settings));
   });
   // ...
   pdlp_thread.join();
   ASSERT_NE(result_ptr, nullptr);
   auto& solution = *result_ptr;
   // ... assertions ...
-  delete result_ptr;
+  // unique_ptr handles cleanup automatically
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/linear_programming/pdlp_test.cu` around lines 2244 - 2278, Replace the raw new/delete and sleep-based sync: allocate the solution into a std::unique_ptr<optimization_problem_solution_t<int,double>> and transfer it out of the thread via a std::promise<std::unique_ptr<...>>/std::future to avoid leaks if exceptions occur (replace result_ptr and the new in the lambda with promise.set_value(std::move(ptr))). Replace the fixed std::this_thread::sleep_for with a deterministic synchronization that waits for the driver simulation to mark entries solved (e.g., use a condition_variable or wait on the atomic ctx.solved flags with a short timeout loop) so the test does not rely on timing; update references to result_ptr to use the future’s returned unique_ptr and remove manual delete.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 517-520: The debug block refers to a non-existent identifier
original_solution; replace it with the correct variable that holds the solve
result in this scope (the one that exposes
get_pdlp_warm_start_data().total_pdlp_iterations_), e.g., use the actual
solution variable name used elsewhere in this function (search for the variable
that calls get_pdlp_warm_start_data()) so the BATCH_VERBOSE_MODE print uses a
valid identifier and compiles.
- Around line 949-957: The reliability-branching path allows batch PDLP when it
should be disabled for nested sub-MIP runs; update the use_pdlp guard to also
check settings.sub_mip is false before starting/using batch PDLP and the
pdlp_warm_cache (i.e., include a check against settings.sub_mip in the boolean
expression that defines use_pdlp so batch PDLP/warm-cache reuse is skipped when
settings.sub_mip is true in the reliability-branching code that computes
use_pdlp/pdlp_warm_cache).
In `@cpp/src/branch_and_bound/pseudo_costs.hpp`:
- Around line 411-420: The PDLP warm-start cache
(batch_pdlp_warm_cache_t::populated and its device_uvector/scalar fields) is
never invalidated, causing stale warm-start data to persist across different
solves; add a reset() method on batch_pdlp_warm_cache_t that clears or
reinitializes initial_primal and initial_dual (release or resize to 0 on the
RAFT/RMM stream), sets populated=false, and resets
step_size/primal_weight/pdlp_iteration/pourcent_solved_by_batch_pdlp_at_root to
NaN/initial values, then ensure that pdlp_warm_cache.reset() is invoked from the
solver’s solve-start path (e.g., inside pseudo_costs_t::reset or the method that
invalidates LP/cut context) so the cache is cleared whenever the LP/cut
configuration changes.
In `@cpp/src/math_optimization/solver_settings.cu`:
- Around line 102-103: The registry metadata entries for
CUOPT_MIP_BATCH_PDLP_STRONG_BRANCHING and
CUOPT_MIP_BATCH_PDLP_RELIABILITY_BRANCHING advertise a default of 0 while
mip_solver_settings_t now default-constructs those knobs to 1; update the
metadata in solver_settings.cu so the default values for the entries referencing
mip_settings.mip_batch_pdlp_strong_branching and
mip_settings.mip_batch_pdlp_reliability_branching are changed from 0 to 1 to
match the constructor defaults and ensure reset/metadata reflect the true
initial state.
In `@cpp/src/pdlp/solver_settings.cu`:
- Around line 351-355: The setter
pdlp_solver_settings_t<i_t,f_t>::set_initial_pdlp_iteration currently accepts
any value; add a validation that rejects negative iterations by checking if
initial_pdlp_iteration < 0 and throwing a std::invalid_argument (with a clear
message) when negative, otherwise assign initial_pdlp_iteration_ =
std::make_optional(initial_pdlp_iteration); ensure the check uses a signed
comparison or constrained type handling so it behaves correctly for signed i_t.
In `@cpp/src/pdlp/termination_strategy/termination_strategy.cu`:
- Around line 127-132: The setter pdlp_termination_strategy_t<i_t,
f_t>::set_termination_status writes termination_status_[id] without validating
id; add an index bounds guard (e.g., compare id against
termination_status_.size() or a stored max_id) before the write and handle
out-of-range cases (assert/throw/log-and-return) to avoid corrupting state when
id is invalid, using the existing parameter names (id, status) and the member
termination_status_.
In
`@python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py`:
- Around line 454-463: The defaults for mip_batch_pdlp_strong_branching and
mip_batch_pdlp_reliability_branching currently use default=0 which disables
cooperative/racing branching; change their defaults to enable cooperative
behavior by default (set default=1) so server requests that omit these fields
use the intended cooperative/batch PDLP mode, and adjust any related
docstring/description text if needed to reflect the new default.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8007b7e7-5e76-42d9-b05d-cd681c53cf23
📒 Files selected for processing (20)
- cpp/include/cuopt/linear_programming/constants.h
- cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
- cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
- cpp/src/branch_and_bound/branch_and_bound.cpp
- cpp/src/branch_and_bound/pseudo_costs.cpp
- cpp/src/branch_and_bound/pseudo_costs.hpp
- cpp/src/branch_and_bound/shared_strong_branching_context.hpp
- cpp/src/dual_simplex/simplex_solver_settings.hpp
- cpp/src/math_optimization/solver_settings.cu
- cpp/src/mip_heuristics/CMakeLists.txt
- cpp/src/mip_heuristics/solver.cu
- cpp/src/pdlp/pdlp.cu
- cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
- cpp/src/pdlp/solve.cu
- cpp/src/pdlp/solver_settings.cu
- cpp/src/pdlp/termination_strategy/termination_strategy.cu
- cpp/src/pdlp/termination_strategy/termination_strategy.hpp
- cpp/src/pdlp/translate.hpp
- cpp/tests/linear_programming/pdlp_test.cu
- python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py
```cpp
template <typename i_t, typename f_t>
struct batch_pdlp_warm_cache_t {
  const raft::handle_t batch_pdlp_handle{};
  rmm::device_uvector<f_t> initial_primal{0, batch_pdlp_handle.get_stream()};
  rmm::device_uvector<f_t> initial_dual{0, batch_pdlp_handle.get_stream()};
  f_t step_size{std::numeric_limits<f_t>::signaling_NaN()};
  f_t primal_weight{std::numeric_limits<f_t>::signaling_NaN()};
  i_t pdlp_iteration{-1};
  f_t pourcent_solved_by_batch_pdlp_at_root{f_t(0.0)};
  bool populated{false};
```
Invalidate the PDLP warm-start cache when the LP context changes.
pdlp_warm_cache becomes sticky once populated is set, but this new persistent state has no reset/invalidation path. Reusing the same pseudo_costs_t for a later solve or a different cut/slack configuration can replay stale device vectors and warm-start scalars into PDLP.
Suggested direction

```diff
 template <typename i_t, typename f_t>
 struct batch_pdlp_warm_cache_t {
   const raft::handle_t batch_pdlp_handle{};
   rmm::device_uvector<f_t> initial_primal{0, batch_pdlp_handle.get_stream()};
   rmm::device_uvector<f_t> initial_dual{0, batch_pdlp_handle.get_stream()};
   f_t step_size{std::numeric_limits<f_t>::signaling_NaN()};
   f_t primal_weight{std::numeric_limits<f_t>::signaling_NaN()};
   i_t pdlp_iteration{-1};
   f_t pourcent_solved_by_batch_pdlp_at_root{f_t(0.0)};
   bool populated{false};
+
+  void reset()
+  {
+    initial_primal.resize(0, batch_pdlp_handle.get_stream());
+    initial_dual.resize(0, batch_pdlp_handle.get_stream());
+    step_size     = std::numeric_limits<f_t>::signaling_NaN();
+    primal_weight = std::numeric_limits<f_t>::signaling_NaN();
+    pdlp_iteration = -1;
+    pourcent_solved_by_batch_pdlp_at_root = f_t(0.0);
+    populated = false;
+  }
 };
```

Call `reset()` from the solve-start path that invalidates the LP/cut context.
Also applies to: 535-535
python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py (review comment resolved)
I think this targets the next 26.06 release. @chris-maes @Kh4ster for advice.

/ok to test f456c71
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/src/pdlp/solve.cu (1)
843-896: ⚠️ Potential issue | 🟠 Major: Honor `generate_batch_primal_dual_solution` in the memory estimate.
The estimator still budgets three full batch solution buffers even when solution collection is disabled. That erases the intended memory win and can force needless sub-batching or a spurious `NumericalError` on workloads that would actually fit. As per coding guidelines, "Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems".
Suggested fix

```diff
-static double batch_pdlp_memory_estimator(const optimization_problem_t<i_t, f_t>& problem,
-                                          double trial_batch_size)
+static double batch_pdlp_memory_estimator(const optimization_problem_t<i_t, f_t>& problem,
+                                          double trial_batch_size,
+                                          bool include_solution_buffers)
 {
   double total_memory = 0.0;
@@
-  total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
-  total_memory += problem.get_n_constraints() * trial_batch_size * sizeof(f_t);
-  total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
+  if (include_solution_buffers) {
+    total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
+    total_memory += problem.get_n_constraints() * trial_batch_size * sizeof(f_t);
+    total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
+  }
@@
-  const double memory_estimate = batch_pdlp_memory_estimator(problem, max_batch_size);
+  const double memory_estimate = batch_pdlp_memory_estimator(
+    problem, max_batch_size, settings.generate_batch_primal_dual_solution);
@@
-  const double memory_estimate = batch_pdlp_memory_estimator(problem, memory_max_batch_size);
+  const double memory_estimate = batch_pdlp_memory_estimator(
+    problem, memory_max_batch_size, settings.generate_batch_primal_dual_solution);
@@
-  const double min_estimate = batch_pdlp_memory_estimator(problem, memory_max_batch_size);
+  const double min_estimate = batch_pdlp_memory_estimator(
+    problem, memory_max_batch_size, settings.generate_batch_primal_dual_solution);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pdlp/solve.cu` around lines 843 - 896, The memory estimator batch_pdlp_memory_estimator currently always allocates three full batch solution buffers; modify it to check the generate_batch_primal_dual_solution flag (or equivalent configuration) and only include the per-batch solution buffers (the three additions near "Data for the solution" that use problem.get_n_variables() * trial_batch_size and problem.get_n_constraints() * trial_batch_size) when solution collection is enabled. Update the logic in batch_pdlp_memory_estimator to conditionally add those solution-related memory contributions based on generate_batch_primal_dual_solution (or the solver option used to control solution generation) so the estimator reflects the true memory needs when solution collection is disabled.
♻️ Duplicate comments (1)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)
994-997: ⚠️ Potential issue | 🟠 Major: Mirror the sub-MIP disable guard in reliability branching.
Root strong branching disables batch PDLP when `settings.sub_mip` is true (line 445), but this reliability branching path doesn't include the same guard. Nested sub-MIP reliability branching can still start the batch PDLP thread and reuse the warm cache, which the PR description indicates should be disabled.
Suggested fix

```diff
-  const bool use_pdlp = (rb_mode != 0) && pdlp_warm_cache.populated &&
+  const bool use_pdlp = !settings.sub_mip &&
+                        (rb_mode != 0) && pdlp_warm_cache.populated &&
                         unreliable_list.size() > min_num_candidates_for_pdlp &&
                         pdlp_warm_cache.pourcent_solved_by_batch_pdlp_at_root >
                           min_pourcent_solved_by_batch_pdlp_at_root_for_pdlp;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 994 - 997, The reliability-branching condition that computes use_pdlp should mirror the root strong-branching guard by disabling batch PDLP when settings.sub_mip is true; update the boolean expression that defines use_pdlp (currently using rb_mode, pdlp_warm_cache, unreliable_list, min_num_candidates_for_pdlp, pdlp_warm_cache.pourcent_solved_by_batch_pdlp_at_root, and min_pourcent_solved_by_batch_pdlp_at_root_for_pdlp) to include a !settings.sub_mip check so nested sub-MIP reliability branching cannot start or reuse the batch PDLP warm cache.
🧹 Nitpick comments (4)
cpp/src/branch_and_bound/pseudo_costs.cpp (4)
478-481: Track the commented-out TODO for dual solution uncrushing.
The `uncrush_dual_solution` call is commented out with a TODO referencing "Chris". This may affect warm-start quality if dual solution components (`root_soln_y`, `root_soln_z`) are needed for PDLP initialization. Consider adding a tracking issue or ensuring this is addressed before the PR exits draft status.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 478 - 481, The call to uncrush_dual_solution(...) is currently commented out, risking degraded warm-starts because root_soln_y/root_soln_z (and original_root_soln_y/original_root_soln_z) won't be restored; either re-enable the call in the PDLP initialization path (using original_problem, original_lp, root_soln_y, root_soln_z, original_root_soln_y, original_root_soln_z) or add a tracked TODO/issue referencing this exact block so it must be resolved before leaving draft status; ensure any enabling is guarded by a feature flag or a clear comment linking the issue/owner (Chris) so reviewers can verify the follow-up.
1167-1168: Consider using `candidate_vars[i]` consistently.
The taskloop uses `unreliable_list[i]` while the post-PDLP merge loop uses `candidate_vars[i]`. Since `candidate_vars` is defined as the first `num_candidates` elements of `unreliable_list`, they're functionally equivalent, but using `candidate_vars[i]` consistently would improve readability and reduce the risk of bugs if the logic changes.

```diff
 for (i_t i = 0; i < num_candidates; ++i) {
-  const i_t j = unreliable_list[i];
+  const i_t j = candidate_vars[i];
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1167 - 1168, The loop currently iterates using unreliable_list[i] while later code uses candidate_vars[i]; since candidate_vars is defined as the first num_candidates of unreliable_list, change the loop to index candidate_vars[i] for consistency and readability: replace uses of unreliable_list[i] inside the for-loop (and any related variables derived there) with candidate_vars[i] so the same symbol is used everywhere (e.g., in the taskloop and the post-PDLP merge loop) to avoid future mismatches between unreliable_list and candidate_vars.
366-374: Consider epsilon comparison for sigma value checks.
The code uses exact floating-point equality (`sigma == -1`, `sigma == 1`) to determine constraint bound computation. While sigma values should be exactly ±1 from matrix construction, exact float comparisons can fail silently if numerical perturbations occur.
More critically, if `sigma` is neither exactly 1 nor exactly -1, the `else` branch triggers an assert in debug mode but leaves `constraint_lower[i]` and `constraint_upper[i]` uninitialized in release builds.
Suggested defensive fix

```diff
-    if (sigma == -1) {
+    if (sigma < 0) {  // sigma == -1
       constraint_lower[i] = slack_lower + lp.rhs[i];
       constraint_upper[i] = slack_upper + lp.rhs[i];
-    } else if (sigma == 1) {
+    } else {  // sigma == +1
       constraint_lower[i] = -slack_upper + lp.rhs[i];
       constraint_upper[i] = -slack_lower + lp.rhs[i];
-    } else {
-      assert(sigma == 1.0 || sigma == -1.0);
     }
+    assert(std::abs(std::abs(sigma) - 1.0) < 1e-9 && "sigma must be ±1");
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 366 - 374, The code currently compares sigma to exact values (-1 and 1) and leaves constraint_lower[i] and constraint_upper[i] uninitialized if sigma is slightly off, so change the branch to use an epsilon check (e.g., fabs(sigma - 1.0) < eps and fabs(sigma + 1.0) < eps) to detect ±1; compute bounds using the existing formulas when matched; otherwise set constraint_lower[i] and constraint_upper[i] to a safe fallback (for example compute using sign = (sigma >= 0 ? 1 : -1) or clamp sigma to its sign) and still trigger the assert/log for unexpected values so release builds get initialized while debug builds still catch the anomaly; reference symbols to update: sigma, constraint_lower, constraint_upper, slack_lower, slack_upper, lp.rhs in pseudo_costs.cpp.
1272-1287: Clean up or track commented-out debug logging.
This block of commented-out logging code with "TODO put back" should either be removed, converted to a proper TODO with issue tracking, or re-enabled. Leaving large commented blocks reduces code readability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1272 - 1287, This large commented-out logging block around rb_mode and log.printf (referencing num_candidates, ds_optimal, ds_infeasible, ds_failed, ds_skipped, ds_elapsed) should not remain in place as-is: either remove it entirely, or restore it by uncommenting and fixing formatting so the if (rb_mode != 2) { ... } branch compiles and uses the existing variables, or replace the block with a single concise TODO comment that references a tracker/issue ID; pick one approach and apply it consistently so the file no longer contains the large commented-out printf block.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp`:
- Around line 287-292: The default for generate_batch_primal_dual_solution is
currently false which breaks existing batch LP callers that expect
primal/dual/reduced-cost buffers; change the default for the
generate_batch_primal_dual_solution member in solver_settings.hpp back to true
so batch solves continue to populate those buffers by default, and update the
inline comment to reflect that the default preserves previous behavior (callers
that want to opt-out can still set generate_batch_primal_dual_solution=false or
use the existing strong/reliability-branching call sites to disable collection).
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 158-172: Fix the typo in the comment above the DS check: change
"iteartion" to "iteration" in the comment that precedes the conditional where
sb_view.is_valid() and the ds_is_valid_done checks (the block that calls
sb_view.mark_solved and logs via settings.log.printf), so the comment reads "We
could not mark as solved nodes hitting iteration limit in DS".
In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp`:
- Around line 37-43: Add runtime bounds validation before any access to the
shared coordination state: ensure that settings.shared_sb_view.solved.size() (or
the span size exposed by shared_strong_branching_context_view_t) is >=
max_batch_size (where max_batch_size = settings.new_bounds.size()) before
entering run_batch_pdlp()'s batch loop; also add guard checks in the methods
is_solved(i_t), mark_solved(i_t) and subview(offset,count) to validate their
index/offset+count against the underlying solved span size and handle/report
invalid accesses (e.g., return false/log/throw) to prevent out-of-bounds access
and corrupted coordination state.
In `@cpp/src/pdlp/solve.cu`:
- Around line 961-965: The code allows settings.sub_batch_size to
unconditionally overwrite optimal_batch_size and can exceed the memory-checked
ceiling (memory_max_batch_size), risking GPU overcommit; update the logic around
optimal_batch_size (used with detail::optimal_batch_size_handler and compared to
max_batch_size) to revalidate or clamp settings.sub_batch_size against the
computed ceiling (memory_max_batch_size or max_batch_size) before assigning, and
adjust the subsequent cuopt_assert to reflect the validated/clamped value so an
oversized manual override cannot bypass the memory check.
In `@cpp/tests/linear_programming/pdlp_test.cu`:
- Around line 2244-2255: The test uses a fixed sleep
(std::this_thread::sleep_for(... 200ms)) to wait for PDLP to start, which is
flaky; replace this timing-based wait with deterministic synchronization: add a
latch/condition (e.g., std::promise/std::future or std::condition_variable +
mutex) or a std::barrier that the PDLP thread signals when it has reached the
point where external marks should be applied, then have the main test wait on
that synchronization primitive before setting ctx.solved[i].store(1); update the
lambda that creates pdlp_thread (pdlp_thread, solve_lp, and the PDLP entry
point) to signal the latch at the proper startup point so
shared_sb_view_concurrent_mark and shared_sb_view_all_infeasible
deterministically exercise the ConcurrentLimit race path.
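A minimal, generic sketch of the deterministic hand-off this prompt asks for, using `std::promise`/`std::future` in place of the fixed sleep. `run_coordinated_demo` and `worker_started` are illustrative names, not part of the cuOpt test suite:

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <thread>
#include <vector>

// Sketch: the worker (standing in for the PDLP thread) signals a promise once
// it has reached the point where external marks may be applied; the test waits
// on the future instead of sleeping for a fixed 200ms.
inline bool run_coordinated_demo()
{
  std::vector<std::atomic<int>> solved(4);
  for (auto& s : solved) s.store(0);

  std::promise<void> worker_started;
  std::future<void> started = worker_started.get_future();

  std::thread worker([&] {
    worker_started.set_value();  // deterministic "I have started" signal
    // ... a real solve loop would poll the solved flags here ...
  });

  started.wait();  // no sleep_for: we know the worker has started
  for (auto& s : solved) s.store(1);  // apply the external marks, as in the test

  worker.join();
  for (auto& s : solved)
    if (s.load() != 1) return false;
  return true;
}
```

The same pattern works with `std::condition_variable` or C++20 `std::latch`; the point is that the mark is applied after a signal, not after a timer.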
---
Outside diff comments:
In `@cpp/src/pdlp/solve.cu`:
- Around line 843-896: The memory estimator batch_pdlp_memory_estimator
currently always allocates three full batch solution buffers; modify it to check
the generate_batch_primal_dual_solution flag (or equivalent configuration) and
only include the per-batch solution buffers (the three additions near "Data for
the solution" that use problem.get_n_variables() * trial_batch_size and
problem.get_n_constraints() * trial_batch_size) when solution collection is
enabled. Update the logic in batch_pdlp_memory_estimator to conditionally add
those solution-related memory contributions based on
generate_batch_primal_dual_solution (or the solver option used to control
solution generation) so the estimator reflects the true memory needs when
solution collection is disabled.
---
Duplicate comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 994-997: The reliability-branching condition that computes
use_pdlp should mirror the root strong-branching guard by disabling batch PDLP
when settings.sub_mip is true; update the boolean expression that defines
use_pdlp (currently using rb_mode, pdlp_warm_cache, unreliable_list,
min_num_candidates_for_pdlp,
pdlp_warm_cache.pourcent_solved_by_batch_pdlp_at_root, and
min_pourcent_solved_by_batch_pdlp_at_root_for_pdlp) to include a
!settings.sub_mip check so nested sub-MIP reliability branching cannot start or
reuse the batch PDLP warm cache.
---
Nitpick comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 478-481: The call to uncrush_dual_solution(...) is currently
commented out, risking degraded warm-starts because root_soln_y/root_soln_z (and
original_root_soln_y/original_root_soln_z) won't be restored; either re-enable
the call in the PDLP initialization path (using original_problem, original_lp,
root_soln_y, root_soln_z, original_root_soln_y, original_root_soln_z) or add a
tracked TODO/issue referencing this exact block so it must be resolved before
leaving draft status; ensure any enabling is guarded by a feature flag or a
clear comment linking the issue/owner (Chris) so reviewers can verify the
follow-up.
- Around line 1167-1168: The loop currently iterates using unreliable_list[i]
while later code uses candidate_vars[i]; since candidate_vars is defined as the
first num_candidates of unreliable_list, change the loop to index
candidate_vars[i] for consistency and readability: replace uses of
unreliable_list[i] inside the for-loop (and any related variables derived there)
with candidate_vars[i] so the same symbol is used everywhere (e.g., in the
taskloop and the post-PDLP merge loop) to avoid future mismatches between
unreliable_list and candidate_vars.
- Around line 366-374: The code currently compares sigma to exact values (-1 and
1) and leaves constraint_lower[i] and constraint_upper[i] uninitialized if sigma
is slightly off, so change the branch to use an epsilon check (e.g., fabs(sigma
- 1.0) < eps and fabs(sigma + 1.0) < eps) to detect ±1; compute bounds using the
existing formulas when matched; otherwise set constraint_lower[i] and
constraint_upper[i] to a safe fallback (for example compute using sign = (sigma
>= 0 ? 1 : -1) or clamp sigma to its sign) and still trigger the assert/log for
unexpected values so release builds get initialized while debug builds still
catch the anomaly; reference symbols to update: sigma, constraint_lower,
constraint_upper, slack_lower, slack_upper, lp.rhs in pseudo_costs.cpp.
- Around line 1272-1287: This large commented-out logging block around rb_mode
and log.printf (referencing num_candidates, ds_optimal, ds_infeasible,
ds_failed, ds_skipped, ds_elapsed) should not remain in place as-is: either
remove it entirely, or restore it by uncommenting and fixing formatting so the
if (rb_mode != 2) { ... } branch compiles and uses the existing variables, or
replace the block with a single concise TODO comment that references a
tracker/issue ID; pick one approach and apply it consistently so the file no
longer contains the large commented-out printf block.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1a11857e-6bdc-4a3c-a045-386b504556d7
📒 Files selected for processing (15)
- cpp/include/cuopt/linear_programming/constants.h
- cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
- cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
- cpp/src/branch_and_bound/pseudo_costs.cpp
- cpp/src/branch_and_bound/shared_strong_branching_context.hpp
- cpp/src/dual_simplex/simplex_solver_settings.hpp
- cpp/src/math_optimization/solver_settings.cu
- cpp/src/pdlp/pdlp.cu
- cpp/src/pdlp/pdlp_constants.hpp
- cpp/src/pdlp/solve.cu
- cpp/src/pdlp/solver_settings.cu
- cpp/src/pdlp/termination_strategy/termination_strategy.cu
- cpp/src/pdlp/utilities/ping_pong_graph.cu
- cpp/tests/linear_programming/pdlp_test.cu
- python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py
💤 Files with no reviewable changes (1)
- cpp/src/pdlp/pdlp_constants.hpp
✅ Files skipped from review due to trivial changes (1)
- cpp/src/pdlp/utilities/ping_pong_graph.cu
🚧 Files skipped from review as they are similar to previous changes (6)
- cpp/include/cuopt/linear_programming/constants.h
- cpp/src/dual_simplex/simplex_solver_settings.hpp
- cpp/src/math_optimization/solver_settings.cu
- python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py
- cpp/src/pdlp/solver_settings.cu
- cpp/src/pdlp/pdlp.cu
```cpp
// By default to save memory and speed we don't store and copy each climber's primal and dual
// solutions We only retrieve termination statistics and the objective values
bool generate_batch_primal_dual_solution{false};
// Used to force batch PDLP to solve a subbatch of the problems at a time
// The 0 default value will make the solver use its heuristic to determine the subbatch size
i_t sub_batch_size{0};
```
Keep batch solution generation opt-out, not opt-in.
With this default at false, generic batch LP solves now return empty primal/dual/reduced-cost buffers unless every caller remembers to opt in. Existing batch tests in cpp/tests/linear_programming/pdlp_test.cu still slice those buffers after solve_lp(..., new_bounds ...), so this silently regresses the non-MIP batch path and can drive helpers like extract_subvector() past a zero-length buffer. The strong/reliability-branching call sites can disable collection explicitly, but the default here needs to preserve the old behavior.
Suggested fix

```diff
-  bool generate_batch_primal_dual_solution{false};
+  bool generate_batch_primal_dual_solution{true};
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 // By default to save memory and speed we don't store and copy each climber's primal and dual
 // solutions We only retrieve termination statistics and the objective values
-bool generate_batch_primal_dual_solution{false};
+bool generate_batch_primal_dual_solution{true};
 // Used to force batch PDLP to solve a subbatch of the problems at a time
 // The 0 default value will make the solver use its heuristic to determine the subbatch size
 i_t sub_batch_size{0};
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp` around lines
287 - 292, The default for generate_batch_primal_dual_solution is currently
false which breaks existing batch LP callers that expect
primal/dual/reduced-cost buffers; change the default for the
generate_batch_primal_dual_solution member in solver_settings.hpp back to true
so batch solves continue to populate those buffers by default, and update the
inline comment to reflect that the default preserves previous behavior (callers
that want to opt-out can still set generate_batch_primal_dual_solution=false or
use the existing strong/reliability-branching call sites to disable collection).
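To see why the opt-in default bites, consider a slicing helper in the spirit of `extract_subvector()`; the checked variant below is hypothetical, not the cuOpt implementation. Slicing a buffer the solver never filled must fail loudly instead of reading past the end:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a batch-solution slicing helper. If solution
// collection was silently disabled, `buffer` comes back empty and the slice
// must be rejected rather than dereferencing past end().
inline std::vector<double> extract_subvector_checked(const std::vector<double>& buffer,
                                                     std::size_t offset,
                                                     std::size_t count)
{
  if (offset > buffer.size() || count > buffer.size() - offset)
    throw std::out_of_range("batch solution buffer shorter than requested slice");
  return {buffer.begin() + offset, buffer.begin() + offset + count};
}
```

An unchecked version of the same helper would silently read out of bounds on a zero-length buffer, which is the regression the comment warns about.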
```cpp
bool is_solved(i_t local_idx) const { return solved[local_idx].load() != 0; }

void mark_solved(i_t local_idx) const { solved[local_idx].store(1); }

shared_strong_branching_context_view_t subview(i_t offset, i_t count) const
{
  return {solved.subspan(offset, count)};
}
```
🧩 Analysis chain
🏁 Scripts executed (outputs collapsed in the original review):

Repository: NVIDIA/cuopt

```shell
# First, locate and examine the header file in question
find . -name "shared_strong_branching_context.hpp" -type f

# Examine the full context of the shared_strong_branching_context.hpp file
cat -n cpp/src/branch_and_bound/shared_strong_branching_context.hpp

# Search for usages of subview, is_solved, and mark_solved in solve.cu
rg "subview|is_solved|mark_solved" cpp/src/pdlp/solve.cu -A 3 -B 3

# Check how the shared_strong_branching_context is initialized and used
rg "shared_strong_branching_context" cpp/src/pdlp/solve.cu -A 2 -B 2

# Get broader context around the subview call in solve.cu
rg "shared_sb_view.subview" cpp/src/pdlp/solve.cu -B 10 -A 5

# Look for the loop that iterates through batches and understand i and current_batch_size
rg "for.*i.*current_batch_size|while.*current_batch_size" cpp/src/pdlp/solve.cu -B 5 -A 10

# Search for any initialization or setup of shared_sb_view to understand its size
rg "shared_sb_view\s*=" cpp/src/pdlp/solve.cu -B 5 -A 3

# Check if there's any size validation before the batch loop
rg "solve.cu" -l | head -1 | xargs wc -l

# Search for where run_batch_pdlp is called and how shared_sb_view is initialized
rg "run_batch_pdlp|shared_strong_branching_context_t" cpp/src/pdlp/ -B 3 -A 3 | head -50

# Check how settings.shared_sb_view is created/passed in
rg "shared_sb_view\s*=" cpp/src/pdlp/ --type cpp --type cu -B 2 -A 2

# Look for how num_subproblems and max_batch_size are related
rg "num_subproblems|max_batch_size" cpp/src/pdlp/solve.cu -B 2 -A 2

# Search for where shared_strong_branching_context_t or shared_sb_view is created/initialized
rg "shared_strong_branching_context_t|shared_sb_view" cpp/src/pdlp/ --type cpp -B 5 -A 5 | head -80

# Look at the full solve.cu function signature and understand the settings parameter
rg -A 30 "template.*run_batch_pdlp" cpp/src/pdlp/solve.cu | head -40

# Check if there are any asserts or validations about shared_sb_view size before the loop
rg "shared_sb_view" cpp/src/pdlp/solve.cu -B 20 | grep -A 20 "run_batch_pdlp"

# Search more broadly in the entire codebase for shared_strong_branching_context
find . -type f \( -name "*.cpp" -o -name "*.hpp" -o -name "*.cu" -o -name "*.cuh" \) -exec grep -l "shared_strong_branching_context" {} \;

# Check the test file to understand how this structure is used
rg "shared_strong_branching_context" cpp/tests/ -B 3 -A 3

# Get the signature and first part of run_batch_pdlp to understand all parameters
sed -n '1,100p' cpp/src/pdlp/solve.cu | rg "run_batch_pdlp" -B 5 -A 25

# Look for any validation of shared_sb_view before it's used
rg "is_valid|shared_sb_view" cpp/src/pdlp/solve.cu -B 2 -A 2

# Verify the exact loop bounds and how max_batch_size relates to context initialization
sed -n '1,200p' cpp/src/pdlp/solve.cu | grep -A 50 "for (size_t i = 0; i < max_batch_size"

# Check solver_settings.hpp to see how shared_sb_view is defined and documented
rg "shared_sb_view" cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp -B 3 -A 3

# Check if there's any documentation or assertion in run_batch_pdlp about shared_sb_view size requirement
sed -n '1,300p' cpp/src/pdlp/solve.cu | tail -100

# Search for any size validation of shared_sb_view in the entire codebase
rg "shared_sb_view.*size|solved.*size" cpp/ -B 2 -A 2

# Look at how the batch_pdlp_solve function calls run_batch_pdlp to see if there's validation there
rg "batch_pdlp_solve|run_batch_pdlp" cpp/src/pdlp/solve.cu -B 5 -A 15 | head -60

# Look for any assertions or documentation about the size relationship
rg "shared_sb_view|max_batch_size" cpp/src/pdlp/solve.cu | grep -i "assert\|check\|expect\|size"

# Check pseudo_costs.cpp to see if there are any additional clues about expected usage
rg "shared_strong_branching_context" cpp/src/branch_and_bound/pseudo_costs.cpp -B 5 -A 5

# Verify std::span behavior by checking if there's any documentation in the code
rg "span.subspan|span\[" cpp/src/branch_and_bound/shared_strong_branching_context.hpp -B 2 -A 2
```
Add bounds validation before accessing the shared context view.
is_solved, mark_solved, and subview directly index/subspan the shared coordination state without runtime bounds checks. The is_valid() guard only detects empty spans, not size mismatches. When run_batch_pdlp() iterates with indices derived from max_batch_size = settings.new_bounds.size(), it must validate that settings.shared_sb_view.solved.size() >= max_batch_size before the batch loop begins. Otherwise, mismatched context sizes will cause out-of-bounds access and corrupt coordination state.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp` around lines 37
- 43, Add runtime bounds validation before any access to the shared coordination
state: ensure that settings.shared_sb_view.solved.size() (or the span size
exposed by shared_strong_branching_context_view_t) is >= max_batch_size (where
max_batch_size = settings.new_bounds.size()) before entering run_batch_pdlp()'s
batch loop; also add guard checks in the methods is_solved(i_t),
mark_solved(i_t) and subview(offset,count) to validate their index/offset+count
against the underlying solved span size and handle/report invalid accesses
(e.g., return false/log/throw) to prevent out-of-bounds access and corrupted
coordination state.
```cpp
size_t optimal_batch_size = use_optimal_batch_size
                              ? detail::optimal_batch_size_handler(problem, memory_max_batch_size)
                              : max_batch_size;
if (settings.sub_batch_size > 0) { optimal_batch_size = settings.sub_batch_size; }
cuopt_assert(optimal_batch_size != 0 && optimal_batch_size <= max_batch_size,
```
Revalidate manual sub-batch overrides against the fitted ceiling.
settings.sub_batch_size overwrites the memory-checked value unconditionally. If the full batch does not fit and the caller forces a larger sub-batch than memory_max_batch_size, this path can still overcommit GPU memory and undo the protection above.
Suggested fix

```diff
 size_t optimal_batch_size = use_optimal_batch_size
                               ? detail::optimal_batch_size_handler(problem, memory_max_batch_size)
                               : max_batch_size;
-if (settings.sub_batch_size > 0) { optimal_batch_size = settings.sub_batch_size; }
+if (settings.sub_batch_size > 0) {
+  const auto requested_sub_batch_size = static_cast<size_t>(settings.sub_batch_size);
+  cuopt_assert(requested_sub_batch_size <= memory_max_batch_size,
+               "sub_batch_size exceeds the memory-checked batch size");
+  optimal_batch_size = requested_sub_batch_size;
+}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 size_t optimal_batch_size = use_optimal_batch_size
                               ? detail::optimal_batch_size_handler(problem, memory_max_batch_size)
                               : max_batch_size;
-if (settings.sub_batch_size > 0) { optimal_batch_size = settings.sub_batch_size; }
+if (settings.sub_batch_size > 0) {
+  const auto requested_sub_batch_size = static_cast<size_t>(settings.sub_batch_size);
+  cuopt_assert(requested_sub_batch_size <= memory_max_batch_size,
+               "sub_batch_size exceeds the memory-checked batch size");
+  optimal_batch_size = requested_sub_batch_size;
+}
 cuopt_assert(optimal_batch_size != 0 && optimal_batch_size <= max_batch_size,
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/pdlp/solve.cu` around lines 961 - 965, The code allows
settings.sub_batch_size to unconditionally overwrite optimal_batch_size and can
exceed the memory-checked ceiling (memory_max_batch_size), risking GPU
overcommit; update the logic around optimal_batch_size (used with
detail::optimal_batch_size_handler and compared to max_batch_size) to revalidate
or clamp settings.sub_batch_size against the computed ceiling
(memory_max_batch_size or max_batch_size) before assigning, and adjust the
subsequent cuopt_assert to reflect the validated/clamped value so an oversized
manual override cannot bypass the memory check.
🧹 Nitpick comments (1)
cpp/include/cuopt/linear_programming/mip/solver_settings.hpp (1)
101-104: Consider replacing `0/1/2` mode literals with shared named constants (or enum).

Using a shared enum/`constexpr` for these mode values would reduce semantic drift between comments, registration bounds, and consumers.

♻️ Example refactor (local to this header)

```diff
+  enum class batch_pdlp_mode_t : i_t {
+    dual_simplex_only = 0,
+    cooperative       = 1,
+    batch_pdlp_only   = 2
+  };
+
-  i_t mip_batch_pdlp_strong_branching{
-    1};  // 0 = DS only, 1 = cooperative DS + PDLP, 2 = batch PDLP only
-  i_t mip_batch_pdlp_reliability_branching{
-    1};  // 0 = DS only, 1 = cooperative DS + PDLP, 2 = batch PDLP only
+  i_t mip_batch_pdlp_strong_branching{
+    static_cast<i_t>(batch_pdlp_mode_t::cooperative)};
+  i_t mip_batch_pdlp_reliability_branching{
+    static_cast<i_t>(batch_pdlp_mode_t::cooperative)};
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/mip/solver_settings.hpp` around lines 101 - 104, Replace the numeric mode literals for mip_batch_pdlp_strong_branching and mip_batch_pdlp_reliability_branching with a shared named enum or constexpr constants (e.g., Mode::DS_ONLY, Mode::COOPERATIVE_DS_PDLP, Mode::BATCH_PDLP_ONLY) declared in this header; update the variable types/initializers to use those named constants, update the comments to refer to the enum names, and ensure any registration/validation logic that currently checks 0/1/2 uses the enum values so bounds and consumers remain consistent with the new symbols.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@cpp/include/cuopt/linear_programming/mip/solver_settings.hpp`:
- Around line 101-104: Replace the numeric mode literals for
mip_batch_pdlp_strong_branching and mip_batch_pdlp_reliability_branching with a
shared named enum or constexpr constants (e.g., Mode::DS_ONLY,
Mode::COOPERATIVE_DS_PDLP, Mode::BATCH_PDLP_ONLY) declared in this header;
update the variable types/initializers to use those named constants, update the
comments to refer to the enum names, and ensure any registration/validation
logic that currently checks 0/1/2 uses the enum values so bounds and consumers
remain consistent with the new symbols.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ea0863f1-03fa-448e-a445-e8836db6e514
📒 Files selected for processing (5)
- cpp/include/cuopt/linear_programming/constants.h
- cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
- cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
- cpp/src/math_optimization/solver_settings.cu
- cpp/src/pdlp/solve.cu
✅ Files skipped from review due to trivial changes (1)
- cpp/src/pdlp/solve.cu
🚧 Files skipped from review as they are similar to previous changes (3)
- cpp/include/cuopt/linear_programming/constants.h
- cpp/src/math_optimization/solver_settings.cu
- cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
|
/ok to test 496c4fd
♻️ Duplicate comments (1)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)
158-172: ⚠️ Potential issue | 🟡 Minor

Minor: Typo in comment.

Line 160 has "iteartion" instead of "iteration".

```diff
-  // We could not mark as solved nodes hitting iteartion limit in DS
+  // We could not mark as solved nodes hitting iteration limit in DS
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 158 - 172, Fix the simple typo in the comment inside the block guarded by sb_view.is_valid() in pseudo_costs.cpp: change "iteartion" to "iteration" in the comment that reads "We could not mark as solved nodes hitting iteartion limit in DS"; the surrounding code uses sb_view.mark_solved(shared_idx) and ds_is_valid_done(ds_status_down[k]) / ds_is_valid_done(ds_status_up[k]) so update only the comment text to "iteration" to correct the spelling.
🧹 Nitpick comments (3)
cpp/src/branch_and_bound/pseudo_costs.cpp (3)
502-505: Redundant assert before identical if-check.

The assert on line 502 checks the same condition as the if-check on line 505. Since the code inside the if-block is the only code path anyway, the assert is redundant.

```diff
-  assert(!pc.pdlp_warm_cache.populated &&
-         "PDLP warm cache should not be populated at this point");
-
-  if (!pc.pdlp_warm_cache.populated) {
+  if (!pc.pdlp_warm_cache.populated) {
+    // Expected: cache is not populated on first entry
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 502 - 505, Remove the redundant assert that checks pc.pdlp_warm_cache.populated immediately before the identical if-check; specifically delete the assert(!pc.pdlp_warm_cache.populated && "PDLP warm cache should not be populated at this point") so only the existing if (!pc.pdlp_warm_cache.populated) { ... } remains (leave the if-block and its logic unchanged).
1283-1301: Address TODO: Remove or restore commented-out logging code.

This block of commented-out logging code should either be restored or removed before merging. If the statistics are useful, consider re-enabling them; otherwise, clean up the dead code.
Would you like me to help restore this logging block with the current variable names, or should it be removed entirely?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1283 - 1301, Restore the commented-out RB Dual Simplex logging: re-enable the conditional block around rb_mode (use rb_mode != 2 check, with the inner rb_mode == 1 branch) and call log.printf with the current variables ds_elapsed (from toc(ds_start_time)), num_candidates, ds_optimal.load(), ds_infeasible.load(), ds_failed.load(), and ds_skipped.load() exactly as in the diff; ensure the two format strings match their branches (one including ds_skipped and “(PDLP)”, the other without it) and that values use num_candidates * 2 for the denominators as shown, keeping correct punctuation and line breaks so the code compiles.
366-374: Consider using epsilon comparison for floating-point sigma values.

Exact floating-point comparison (`sigma == -1`, `sigma == 1`) can fail if `sigma` has any numerical noise from matrix operations. While slack coefficients are typically constructed as exact +1/-1, an epsilon-based comparison would be more robust.

```diff
-  if (sigma == -1) {
+  constexpr f_t eps = 1e-9;
+  if (std::abs(sigma - (-1.0)) < eps) {
     constraint_lower[i] = slack_lower + lp.rhs[i];
     constraint_upper[i] = slack_upper + lp.rhs[i];
-  } else if (sigma == 1) {
+  } else if (std::abs(sigma - 1.0) < eps) {
     constraint_lower[i] = -slack_upper + lp.rhs[i];
     constraint_upper[i] = -slack_lower + lp.rhs[i];
   } else {
-    assert(sigma == 1.0 || sigma == -1.0);
+    assert(false && "sigma must be +1 or -1");
   }
```

As per coding guidelines: "use epsilon comparisons for floating-point equality checks"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 366 - 373, The code compares floating-point sigma to exact integers; replace direct comparisons in the block handling sigma with epsilon-based checks (e.g., fabs(sigma - 1.0) < EPS and fabs(sigma + 1.0) < EPS) when assigning constraint_lower[i] and constraint_upper[i] using slack_lower/slack_upper and lp.rhs[i]; define or reuse a small EPS constant and update the final assert to assert(fabs(sigma - 1.0) < EPS || fabs(sigma + 1.0) < EPS) so the logic in this function (pseudo_costs.cpp handling sigma, constraint_lower, constraint_upper, slack_lower, slack_upper) is robust to floating-point noise.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 158-172: Fix the simple typo in the comment inside the block
guarded by sb_view.is_valid() in pseudo_costs.cpp: change "iteartion" to
"iteration" in the comment that reads "We could not mark as solved nodes hitting
iteartion limit in DS"; the surrounding code uses
sb_view.mark_solved(shared_idx) and ds_is_valid_done(ds_status_down[k]) /
ds_is_valid_done(ds_status_up[k]) so update only the comment text to "iteration"
to correct the spelling.
---
Nitpick comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 502-505: Remove the redundant assert that checks
pc.pdlp_warm_cache.populated immediately before the identical if-check;
specifically delete the assert(!pc.pdlp_warm_cache.populated && "PDLP warm cache
should not be populated at this point") so only the existing if
(!pc.pdlp_warm_cache.populated) { ... } remains (leave the if-block and its
logic unchanged).
- Around line 1283-1301: Restore the commented-out RB Dual Simplex logging:
re-enable the conditional block around rb_mode (use rb_mode != 2 check, with the
inner rb_mode == 1 branch) and call log.printf with the current variables
ds_elapsed (from toc(ds_start_time)), num_candidates, ds_optimal.load(),
ds_infeasible.load(), ds_failed.load(), and ds_skipped.load() exactly as in the
diff; ensure the two format strings match their branches (one including
ds_skipped and “(PDLP)”, the other without it) and that values use
num_candidates * 2 for the denominators as shown, keeping correct punctuation
and line breaks so the code compiles.
- Around line 366-373: The code compares floating-point sigma to exact integers;
replace direct comparisons in the block handling sigma with epsilon-based checks
(e.g., fabs(sigma - 1.0) < EPS and fabs(sigma + 1.0) < EPS) when assigning
constraint_lower[i] and constraint_upper[i] using slack_lower/slack_upper and
lp.rhs[i]; define or reuse a small EPS constant and update the final assert to
assert(fabs(sigma - 1.0) < EPS || fabs(sigma + 1.0) < EPS) so the logic in this
function (pseudo_costs.cpp handling sigma, constraint_lower, constraint_upper,
slack_lower, slack_upper) is robust to floating-point noise.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c9d7be61-6bc3-4400-aa36-b21d1943fd39
📒 Files selected for processing (1)
cpp/src/branch_and_bound/pseudo_costs.cpp
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp`:
- Around line 49-53: The bounds check in subview can overflow when adding signed
i_t values; replace the single assert(offset + count <= solved.size()) with a
safe check that prevents overflow by validating non-negativity (if i_t is
signed) and comparing sizes using the same unsigned type as solved.size():
ensure offset >= 0 and count >= 0 (if applicable), then cast to the span-size
type (e.g. std::size_t) and assert(static_cast<size_type>(offset) <=
solved.size() && static_cast<size_type>(count) <= solved.size() -
static_cast<size_type>(offset)); also use those casts when calling
solved.subspan(offset, count) so subview uses consistent unsigned indices.
- Around line 37-47: The bounds checks in is_solved and mark_solved are unsafe
if i_t is signed because a negative local_idx will convert to a large unsigned
when compared to solved.size() and assert may be disabled in release; update
both methods (is_solved and mark_solved) to first check that local_idx is
non-negative and then compare static_cast<size_t>(local_idx) against
solved.size(), and handle failures with a runtime-safe response (e.g., throw
std::out_of_range or return/abort) instead of relying solely on assert so access
to solved[local_idx] is always safe.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4f7a71dc-8774-4f65-90e4-9dd5c98b0e0a
📒 Files selected for processing (1)
cpp/src/branch_and_bound/shared_strong_branching_context.hpp
```cpp
bool is_solved(i_t local_idx) const
{
  assert(local_idx < solved.size() && "local_idx out of bounds");
  return solved[local_idx].load() != 0;
}

void mark_solved(i_t local_idx) const
{
  assert(local_idx < solved.size() && "local_idx out of bounds");
  solved[local_idx].store(1);
}
```
Signed/unsigned comparison hazard if i_t is signed.
If i_t is a signed type (e.g., int), a negative local_idx value will implicitly convert to a large unsigned value when compared against solved.size(), bypassing the bounds check and causing undefined behavior on access. Additionally, assert is disabled in release builds, leaving no runtime protection.
Consider adding an explicit non-negative check or casting to ensure safe comparison:
🛡️ Proposed fix
```diff
 bool is_solved(i_t local_idx) const
 {
-  assert(local_idx < solved.size() && "local_idx out of bounds");
+  assert(local_idx >= 0 && static_cast<size_t>(local_idx) < solved.size() && "local_idx out of bounds");
   return solved[local_idx].load() != 0;
 }

 void mark_solved(i_t local_idx) const
 {
-  assert(local_idx < solved.size() && "local_idx out of bounds");
+  assert(local_idx >= 0 && static_cast<size_t>(local_idx) < solved.size() && "local_idx out of bounds");
   solved[local_idx].store(1);
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp` around lines 37
- 47, The bounds checks in is_solved and mark_solved are unsafe if i_t is signed
because a negative local_idx will convert to a large unsigned when compared to
solved.size() and assert may be disabled in release; update both methods
(is_solved and mark_solved) to first check that local_idx is non-negative and
then compare static_cast<size_t>(local_idx) against solved.size(), and handle
failures with a runtime-safe response (e.g., throw std::out_of_range or
return/abort) instead of relying solely on assert so access to solved[local_idx]
is always safe.
Thanks for the hard work, Nicolas!
My main concerns are:
- Do not create a `std::thread` at each call to the reliability branching. Instead, use `omp task` to integrate with the rest of B&B, share the CPU resources more efficiently, and avoid oversubscribing the CPU.
- Since two or more threads might call the reliability branching at the same time, some variables may appear in multiple unreliable lists. Yet, a single call to strong branching can be enough to turn the variable into reliable. If this is the case, then we can propagate the score from one thread to another rather than calling strong branching again. That is why there is a mutex per variable that locks when performing a trial branching. When using batch PDLP, we also need to consider this case and mark the variable accordingly. Multiple calls to strong branching are not necessarily a bad thing (more information about a variable can lead to better decisions); however, they may be too expensive.
- Considering that there are many simultaneous calls to batch PDLP, are the kernels inserted into the same CUDA stream, or does each call go to a separate stream? The latter may be better for performance, but might mess up the determinism.
```cpp
namespace {

static bool ds_is_valid_done(dual::status_t status)
```
Nitpick: ds_is_valid_done --> is_dual_simplex_done
Thanks, I will update that.
```cpp
template <typename i_t, typename f_t>
void strong_branching(const user_problem_t<i_t, f_t>& original_problem,
                      const lp_problem_t<i_t, f_t>& original_lp,
static std::pair<f_t, i_t> merge_sb_result(f_t ds_val,
```
Nitpick: use an enum instead of an int to set the source. This avoids setting an incorrect value in the future and makes the intent of the return value clearer.
```cpp
const std::vector<i_t>& new_slacks,
const std::vector<variable_type_t>& var_types,
const std::vector<f_t> root_soln,
const std::vector<f_t>& root_soln,
```
Note that in #979, I am setting the root_soln to lp_solution such that x, y and z are already available.
Yes, if your PR gets merged first I will use it instead.
```cpp
const std::vector<f_t>& edge_norms,
pseudo_costs_t<i_t, f_t>& pc)
pseudo_costs_t<i_t, f_t>& pc,
std::vector<f_t>& ds_obj_down,
```
I would avoid abbreviating dual simplex as ds; we do not use this terminology in other parts of the code base.
Similarly, strong_branching vs sb or reliability branching vs rb.
```cpp
fractional.size());

// Cooperative DS + PDLP: shared context tracks which subproblems are solved
shared_strong_branching_context_t<i_t, f_t> shared_ctx(2 * fractional.size());
```
I have a question: why do we need 2 structs here? Can you just pass a pointer or reference to shared_strong_branching_context_t instead?
The original idea for the view version was to have sub-views: as in BPDLP, I might solve parts of the batch one after another, and the sub-view would abstract away the initial offset.
```cpp
} else if (rb_mode != 0 && unreliable_list.size() < min_num_candidates_for_pdlp) {
  log.printf("Not enough candidates to use batch PDLP, using DS only\n");
} else if (rb_mode != 0 && pdlp_warm_cache.pourcent_solved_by_batch_pdlp_at_root < 5.0) {
  log.printf("Pourcent solved by batch PDLP at root is too low, using DS only\n");
```
Typo: Pourcent --> Percent
You need to check for other places as well
```cpp
const int num_tasks = std::max(max_num_tasks, 1);
const int task_priority = reliability_branching_settings.task_priority;
const i_t max_num_candidates = reliability_branching_settings.max_num_candidates;
const i_t rb_mode = settings.mip_batch_pdlp_reliability_branching;
```
I would suggest moving reliability branching with batch PDLP into a separate routine, and keeping the original with just dual simplex.
Also, it is important to consider that we are setting the reliability threshold based on the number of dual simplex iterations in reliability branching and B&B. So we may need a way to add the PDLP iterations to the calculation as well
```cpp
std::vector<f_t> pdlp_obj_up(num_candidates, std::numeric_limits<f_t>::quiet_NaN());

std::atomic<int> concurrent_halt{0};
std::thread pdlp_thread;
```
I would avoid creating a `std::thread` here for three reasons:

- The entire B&B is using OpenMP, so it is best to use the same model throughout the entire solver.
- This code section is executed in parallel by a large number of threads, so we are essentially doubling the number of executing threads. This is very bad for performance.
- This routine is called very frequently, so creating and destroying a thread may add significant overhead.
In other words, replace std::thread with omp task
…ather than thread, put both bpdlp call in functions
…ble, use one stream per BPDLP in RB
/ok to test 6e5baa5
This is still a draft PR. Needs extra work. Just for early review.
This PR enables the following regarding batch PDLP: