
Race batch PDLP and dual simplex in strong branching / reliability branching#994

Open
Kh4ster wants to merge 52 commits into release/26.04 from race_strong_branching_realibility_branching

Conversation

@Kh4ster
Contributor

@Kh4ster Kh4ster commented Mar 25, 2026

This is still a draft PR. Needs extra work. Just for early review.

This PR enables the following regarding batch PDLP:

  • Enable batch PDLP in reliability branching
  • Add work stealing so that batch PDLP and Dual Simplex can run concurrently and steal LPs from each other when one solves them first
  • Use correct problem representation with cuts for batch PDLP
  • Use a PDLP warm start cache across strong branching at the root and in reliability branching
  • Increase tolerance on batch PDLP to obtain higher-quality solutions
  • Increase the iteration limit so that instances that need a high iteration count (with low cost per iteration) can still come through (only while solving the original LP to get warm-start data)
  • Add multiple heuristics that skip batch PDLP to avoid overhead when Dual Simplex is clearly superior
  • Don't store and copy the primal/dual solution unless needed, to save memory
  • Handle batch PDLP errors better, allowing Dual Simplex to still continue in strong branching even if BPDLP fails

Kh4ster and others added 30 commits February 9, 2026 17:47
…te_lp_problem' into race_strong_branching_realibility_branching
…ng, correctly fill the ds_obj objective before merging results at the root, correctly clamp the PDLP objective, remove the unucessary cuopt_assert regarding fixed point error
@Kh4ster Kh4ster requested a review from nguidotti March 25, 2026 17:27
@Kh4ster Kh4ster self-assigned this Mar 25, 2026
@Kh4ster Kh4ster added non-breaking Introduces a non-breaking change improvement Improves an existing functionality pdlp mip labels Mar 25, 2026
@coderabbitai

coderabbitai bot commented Mar 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Adds cooperative Dual Simplex ⇄ Batch PDLP strong/reliability branching, new tri-state parameters/macros, shared strong-branching context types, PDLP warm-start / initial-iteration controls, coordinated concurrent termination and optional per-batch solution collection, plus API/signature updates across branching, PDLP, solver settings, tests, and build files.

Changes

Cohort / File(s) Summary
MIP Parameter Constants
cpp/include/cuopt/linear_programming/constants.h, cpp/src/math_optimization/solver_settings.cu
Added CUOPT_MIP_BATCH_PDLP_RELIABILITY_BRANCHING macro/parameter; widened CUOPT_MIP_BATCH_PDLP_STRONG_BRANCHING range to 0..2 and changed its default to 1.
MIP Solver Settings (Public API)
cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
Changed mip_batch_pdlp_strong_branching default to 1 and added new mip_batch_pdlp_reliability_branching member.
Dual Simplex / Simplex Settings
cpp/src/dual_simplex/simplex_solver_settings.hpp
Changed mip_batch_pdlp_strong_branching semantics to tri-state and added mip_batch_pdlp_reliability_branching field (tri-state).
PDLP Solver Settings (API & Impl)
cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp, cpp/src/pdlp/solver_settings.cu
Added initial-PDLP-iteration accessors (set/get_initial_pdlp_iteration), shared_sb_view, generate_batch_primal_dual_solution, sub_batch_size; strengthened validations for initial step/weight setters.
Shared Strong Branching Context
cpp/src/branch_and_bound/shared_strong_branching_context.hpp
New shared_strong_branching_context_t and shared_strong_branching_context_view_t for atomic per-subproblem solved-state sharing and subviews.
Pseudo-Costs & Strong Branching (API + Impl)
cpp/src/branch_and_bound/pseudo_costs.hpp, cpp/src/branch_and_bound/pseudo_costs.cpp, cpp/src/branch_and_bound/branch_and_bound.cpp
Refactored strong-branching signatures to pass new_slacks and root dual components; trial_branching now returns (objective,status); added batch_pdlp_warm_cache_t; implemented cooperative PDLP/DS parallel launch, merge logic, and updated call sites.
PDLP Core Solver & Batch Sizing
cpp/src/pdlp/pdlp.cu, cpp/src/pdlp/solve.cu
Integrated shared-context checks to mark/skip externally-solved climbers; switched concurrent halt checks to atomic loads; added initial-iteration/warm-start sourcing; memory estimator uses double and larger safety factor; batch sizing returns NumericalError on unsatisfiable size; optional per-batch solution collection and sub-batch sizing added.
PDLP Termination & Restart
cpp/src/pdlp/termination_strategy/termination_strategy.hpp, cpp/src/pdlp/termination_strategy/termination_strategy.cu, cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
Added set_termination_status(id,status); treat ConcurrentLimit as terminal in is_done; added debug output in restart check.
MIP Integration & Build
cpp/src/mip_heuristics/solver.cu, cpp/src/mip_heuristics/CMakeLists.txt, cpp/src/pdlp/translate.hpp
Propagated mip_batch_pdlp_reliability_branching into branch-and-bound settings; moved clique_table.cu into LP-required list; added MIP include to PDLP translate header.
Tests
cpp/tests/linear_programming/pdlp_test.cu
Enabled generation of primal/dual outputs and added cooperative strong-branching tests covering shared-context view, sub-batching, threaded interaction, and termination-status expectations.
Python Config
python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py
Changed mip_batch_pdlp_strong_branching default to 1 and description to tri-state; added mip_batch_pdlp_reliability_branching field (default 1, tri-state).
Miscellaneous Includes / Low-risk changes
cpp/src/pdlp/pdlp_constants.hpp, cpp/src/pdlp/utilities/ping_pong_graph.cu
Removed unused include from pdlp_constants.hpp; added RAFT CUDA utils include for error macros in ping_pong_graph.cu.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 2.38%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
✅ Passed checks (2 passed)
Title check: ✅ Passed. The title clearly summarizes the main change: enabling batch PDLP racing with Dual Simplex in strong branching and reliability branching.
Description check: ✅ Passed. The description is directly related to the changeset, detailing multiple aspects of batch PDLP integration into strong/reliability branching.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)

104-109: ⚠️ Potential issue | 🔴 Critical

Don't derive early-stopped simplex bounds from solution.x.

For ITERATION_LIMIT, dual simplex is still useful here because it preserves a valid lower bound, but compute_objective(child_problem, solution.x) is not that bound. Feeding c^T x from a non-optimal iterate into strong-branch scores and pseudo-cost updates can bias branching decisions.

Based on learnings, cuOpt dual simplex is dual-safe, so the safe lower bound at ITERATION_LIMIT is the solver’s maintained objective value, not an objective recomputed from the current primal iterate.

Also applies to: 257-259

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 104 - 109, The code
sets obj = compute_objective(child_problem, solution.x) when status ==
dual::status_t::ITERATION_LIMIT, which incorrectly uses a non‑optimal primal
iterate; instead fetch and use the solver's maintained objective/lower bound for
the iterate (the dual‑safe objective value the solver preserves) rather than
recomputing c^T x from solution.x. Locate the status checks
(dual::status_t::ITERATION_LIMIT) and replace the
compute_objective(child_problem, solution.x) use with the solver‑reported
objective/lower bound (e.g., the solution's maintained objective field or a
solver API that returns the current bound) in both the block around
compute_objective and the other occurrence that mirrors lines 257-259 so
pseudo‑cost updates and strong‑branch scoring use the dual‑safe bound.
🧹 Nitpick comments (4)
cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp (1)

21-22: Consider moving the shared strong-branching view to a solver-neutral header.

pdlp_solver_settings_t now pulls a branch_and_bound/ header into the PDLP settings layer just to carry a coordination view. Relocating that type to a shared LP module would keep the dependency direction cleaner and make PDLP easier to reuse outside branch-and-bound.

Based on learnings "Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms"

Also applies to: 277-278

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp` around lines
21 - 22, The PDLP settings header currently includes
branch_and_bound/shared_strong_branching_context.hpp to obtain the coordination
view; instead move the shared strong-branching view type out of branch_and_bound
into a solver-neutral header in the shared LP module (e.g.,
shared_lp/strong_branching_view.hpp) and update pdlp_solver_settings_t to
include that new header rather than the branch_and_bound one; also update any
other files that currently include shared_strong_branching_context.hpp (refs:
pdlp_solver_settings_t and the other occurrences noted) to include the new
solver-neutral header so PDLP no longer depends on branch_and_bound.
cpp/src/branch_and_bound/shared_strong_branching_context.hpp (1)

37-42: Consider relaxed memory ordering for potential performance improvement.

The current default memory_order_seq_cst is correct but may have unnecessary overhead. Since the solved flag is only used as a hint (no other data depends on it), relaxed ordering would suffice:

♻️ Optional: Use relaxed memory ordering
   bool is_solved(i_t local_idx) const
   {
-    return solved[local_idx].load() != 0;
+    return solved[local_idx].load(std::memory_order_relaxed) != 0;
   }

-  void mark_solved(i_t local_idx) const { solved[local_idx].store(1); }
+  void mark_solved(i_t local_idx) const { solved[local_idx].store(1, std::memory_order_relaxed); }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp` around lines 37
- 42, The solved atomic currently uses default seq_cst ordering in is_solved and
mark_solved; change them to use relaxed ordering since solved[] is only a hint
and no other data depends on it: update is_solved to call
solved[local_idx].load(std::memory_order_relaxed) and update mark_solved to call
solved[local_idx].store(1, std::memory_order_relaxed) (references: is_solved,
mark_solved, solved).
cpp/tests/linear_programming/pdlp_test.cu (2)

2310-2342: Same memory leak risk with raw pointer pattern.

This test has the same raw new/delete pattern as shared_sb_view_concurrent_mark. Apply the same std::unique_ptr fix for exception safety.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/linear_programming/pdlp_test.cu` around lines 2310 - 2342, Replace
the raw new/delete pattern with a std::unique_ptr to ensure exception safety:
change result_ptr from a raw pointer to
std::unique_ptr<optimization_problem_solution_t<int,double>> result_ptr; inside
the pdlp_thread lambda assign result_ptr =
std::make_unique<optimization_problem_solution_t<int,double>>(solve_lp(&handle_,
op_problem, solver_settings)); remove the explicit delete result_ptr; keep the
existing dereference (auto& solution = *result_ptr) and joins unchanged; this
preserves semantics while preventing leaks.

2244-2278: Consider using std::unique_ptr to prevent memory leaks on exceptions.

The raw new/delete pattern risks leaking memory if an exception is thrown between allocation and cleanup. Additionally, the 200ms sleep-based synchronization may cause test flakiness under varying system loads.

♻️ Suggested improvement using unique_ptr
-  optimization_problem_solution_t<int, double>* result_ptr = nullptr;
+  std::unique_ptr<optimization_problem_solution_t<int, double>> result_ptr;

   auto pdlp_thread = std::thread([&]() {
-    auto sol = new optimization_problem_solution_t<int, double>(
-      solve_lp(&handle_, op_problem, solver_settings));
-    result_ptr = sol;
+    result_ptr = std::make_unique<optimization_problem_solution_t<int, double>>(
+      solve_lp(&handle_, op_problem, solver_settings));
   });

   // ...

   pdlp_thread.join();

   ASSERT_NE(result_ptr, nullptr);
-  auto& solution = *result_ptr;
+  auto& solution = *result_ptr;

   // ... assertions ...

-  delete result_ptr;
+  // unique_ptr handles cleanup automatically
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/linear_programming/pdlp_test.cu` around lines 2244 - 2278, Replace
the raw new/delete and sleep-based sync: allocate the solution into a
std::unique_ptr<optimization_problem_solution_t<int,double>> and transfer it out
of the thread via a std::promise<std::unique_ptr<...>>/std::future to avoid
leaks if exceptions occur (replace result_ptr and the new in the lambda with
promise.set_value(std::move(ptr))). Replace the fixed
std::this_thread::sleep_for with a deterministic synchronization that waits for
the driver simulation to mark entries solved (e.g., use a condition_variable or
wait on the atomic ctx.solved flags with a short timeout loop) so the test does
not rely on timing; update references to result_ptr to use the future’s returned
unique_ptr and remove manual delete.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 517-520: The debug block refers to a non-existent identifier
original_solution; replace it with the correct variable that holds the solve
result in this scope (the one that exposes
get_pdlp_warm_start_data().total_pdlp_iterations_), e.g., use the actual
solution variable name used elsewhere in this function (search for the variable
that calls get_pdlp_warm_start_data()) so the BATCH_VERBOSE_MODE print uses a
valid identifier and compiles.
- Around line 949-957: The reliability-branching path allows batch PDLP when it
should be disabled for nested sub-MIP runs; update the use_pdlp guard to also
check settings.sub_mip is false before starting/using batch PDLP and the
pdlp_warm_cache (i.e., include a check against settings.sub_mip in the boolean
expression that defines use_pdlp so batch PDLP/warm-cache reuse is skipped when
settings.sub_mip is true in the reliability-branching code that computes
use_pdlp/pdlp_warm_cache).

In `@cpp/src/branch_and_bound/pseudo_costs.hpp`:
- Around line 411-420: The PDLP warm-start cache
(batch_pdlp_warm_cache_t::populated and its device_uvector/scalar fields) is
never invalidated, causing stale warm-start data to persist across different
solves; add a reset() method on batch_pdlp_warm_cache_t that clears or
reinitializes initial_primal and initial_dual (release or resize to 0 on the
RAFT/RMM stream), sets populated=false, and resets
step_size/primal_weight/pdlp_iteration/pourcent_solved_by_batch_pdlp_at_root to
NaN/initial values, then ensure that pdlp_warm_cache.reset() is invoked from the
solver’s solve-start path (e.g., inside pseudo_costs_t::reset or the method that
invalidates LP/cut context) so the cache is cleared whenever the LP/cut
configuration changes.

In `@cpp/src/math_optimization/solver_settings.cu`:
- Around line 102-103: The registry metadata entries for
CUOPT_MIP_BATCH_PDLP_STRONG_BRANCHING and
CUOPT_MIP_BATCH_PDLP_RELIABILITY_BRANCHING advertise a default of 0 while
mip_solver_settings_t now default-constructs those knobs to 1; update the
metadata in solver_settings.cu so the default values for the entries referencing
mip_settings.mip_batch_pdlp_strong_branching and
mip_settings.mip_batch_pdlp_reliability_branching are changed from 0 to 1 to
match the constructor defaults and ensure reset/metadata reflect the true
initial state.

In `@cpp/src/pdlp/solver_settings.cu`:
- Around line 351-355: The setter
pdlp_solver_settings_t<i_t,f_t>::set_initial_pdlp_iteration currently accepts
any value; add a validation that rejects negative iterations by checking if
initial_pdlp_iteration < 0 and throwing a std::invalid_argument (with a clear
message) when negative, otherwise assign initial_pdlp_iteration_ =
std::make_optional(initial_pdlp_iteration); ensure the check uses a signed
comparison or constrained type handling so it behaves correctly for signed i_t.

In `@cpp/src/pdlp/termination_strategy/termination_strategy.cu`:
- Around line 127-132: The setter pdlp_termination_strategy_t<i_t,
f_t>::set_termination_status writes termination_status_[id] without validating
id; add an index bounds guard (e.g., compare id against
termination_status_.size() or a stored max_id) before the write and handle
out-of-range cases (assert/throw/log-and-return) to avoid corrupting state when
id is invalid, using the existing parameter names (id, status) and the member
termination_status_.

In
`@python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py`:
- Around line 454-463: The defaults for mip_batch_pdlp_strong_branching and
mip_batch_pdlp_reliability_branching currently use default=0 which disables
cooperative/racing branching; change their defaults to enable cooperative
behavior by default (set default=1) so server requests that omit these fields
use the intended cooperative/batch PDLP mode, and adjust any related
docstring/description text if needed to reflect the new default.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8007b7e7-5e76-42d9-b05d-cd681c53cf23

📥 Commits

Reviewing files that changed from the base of the PR and between f80d6cf and b0061e4.

📒 Files selected for processing (20)
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
  • cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/branch_and_bound/pseudo_costs.hpp
  • cpp/src/branch_and_bound/shared_strong_branching_context.hpp
  • cpp/src/dual_simplex/simplex_solver_settings.hpp
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/src/mip_heuristics/CMakeLists.txt
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
  • cpp/src/pdlp/solve.cu
  • cpp/src/pdlp/solver_settings.cu
  • cpp/src/pdlp/termination_strategy/termination_strategy.cu
  • cpp/src/pdlp/termination_strategy/termination_strategy.hpp
  • cpp/src/pdlp/translate.hpp
  • cpp/tests/linear_programming/pdlp_test.cu
  • python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py

Comment on lines +411 to +420
template <typename i_t, typename f_t>
struct batch_pdlp_warm_cache_t {
const raft::handle_t batch_pdlp_handle{};
rmm::device_uvector<f_t> initial_primal{0, batch_pdlp_handle.get_stream()};
rmm::device_uvector<f_t> initial_dual{0, batch_pdlp_handle.get_stream()};
f_t step_size{std::numeric_limits<f_t>::signaling_NaN()};
f_t primal_weight{std::numeric_limits<f_t>::signaling_NaN()};
i_t pdlp_iteration{-1};
f_t pourcent_solved_by_batch_pdlp_at_root{f_t(0.0)};
bool populated{false};


⚠️ Potential issue | 🟠 Major

Invalidate the PDLP warm-start cache when the LP context changes.

pdlp_warm_cache becomes sticky once populated is set, but this new persistent state has no reset/invalidation path. Reusing the same pseudo_costs_t for a later solve or a different cut/slack configuration can replay stale device vectors and warm-start scalars into PDLP.

Suggested direction
 template <typename i_t, typename f_t>
 struct batch_pdlp_warm_cache_t {
   const raft::handle_t batch_pdlp_handle{};
   rmm::device_uvector<f_t> initial_primal{0, batch_pdlp_handle.get_stream()};
   rmm::device_uvector<f_t> initial_dual{0, batch_pdlp_handle.get_stream()};
   f_t step_size{std::numeric_limits<f_t>::signaling_NaN()};
   f_t primal_weight{std::numeric_limits<f_t>::signaling_NaN()};
   i_t pdlp_iteration{-1};
   f_t pourcent_solved_by_batch_pdlp_at_root{f_t(0.0)};
   bool populated{false};
+
+  void reset()
+  {
+    initial_primal.resize(0, batch_pdlp_handle.get_stream());
+    initial_dual.resize(0, batch_pdlp_handle.get_stream());
+    step_size = std::numeric_limits<f_t>::signaling_NaN();
+    primal_weight = std::numeric_limits<f_t>::signaling_NaN();
+    pdlp_iteration = -1;
+    pourcent_solved_by_batch_pdlp_at_root = f_t(0.0);
+    populated = false;
+  }
 };

Call reset() from the solve-start path that invalidates the LP/cut context.

As per coding guidelines, algorithm state must be reset when transitioning between solver phases.

Also applies to: 535-535


@chris-maes chris-maes changed the title Race strong branching realibility branching Race batch PDLP and dual simplex in strong branching realibility branching Mar 25, 2026
@chris-maes chris-maes changed the title Race batch PDLP and dual simplex in strong branching realibility branching Race batch PDLP and dual simplex in strong branching / reliability branching Mar 25, 2026
@anandhkb anandhkb added this to the 26.06 milestone Mar 26, 2026
@anandhkb
Contributor

I think this targets the next 26.06 release. @chris-maes @Kh4ster for advice.

@Kh4ster
Contributor Author

Kh4ster commented Mar 30, 2026

/ok to test f456c71


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/pdlp/solve.cu (1)

843-896: ⚠️ Potential issue | 🟠 Major

Honor generate_batch_primal_dual_solution in the memory estimate.

The estimator still budgets three full batch solution buffers even when solution collection is disabled. That erases the intended memory win and can force needless sub-batching or a spurious NumericalError on workloads that would actually fit.

Suggested fix
-static double batch_pdlp_memory_estimator(const optimization_problem_t<i_t, f_t>& problem,
-                                          double trial_batch_size)
+static double batch_pdlp_memory_estimator(const optimization_problem_t<i_t, f_t>& problem,
+                                          double trial_batch_size,
+                                          bool include_solution_buffers)
 {
   double total_memory = 0.0;
@@
-  total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
-  total_memory += problem.get_n_constraints() * trial_batch_size * sizeof(f_t);
-  total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
+  if (include_solution_buffers) {
+    total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
+    total_memory += problem.get_n_constraints() * trial_batch_size * sizeof(f_t);
+    total_memory += problem.get_n_variables() * trial_batch_size * sizeof(f_t);
+  }
@@
-  const double memory_estimate = batch_pdlp_memory_estimator(problem, max_batch_size);
+  const double memory_estimate = batch_pdlp_memory_estimator(
+    problem, max_batch_size, settings.generate_batch_primal_dual_solution);
@@
-      const double memory_estimate = batch_pdlp_memory_estimator(problem, memory_max_batch_size);
+      const double memory_estimate = batch_pdlp_memory_estimator(
+        problem, memory_max_batch_size, settings.generate_batch_primal_dual_solution);
@@
-    const double min_estimate = batch_pdlp_memory_estimator(problem, memory_max_batch_size);
+    const double min_estimate = batch_pdlp_memory_estimator(
+      problem, memory_max_batch_size, settings.generate_batch_primal_dual_solution);
As per coding guidelines, "Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/solve.cu` around lines 843 - 896, The memory estimator
batch_pdlp_memory_estimator currently always allocates three full batch solution
buffers; modify it to check the generate_batch_primal_dual_solution flag (or
equivalent configuration) and only include the per-batch solution buffers (the
three additions near "Data for the solution" that use problem.get_n_variables()
* trial_batch_size and problem.get_n_constraints() * trial_batch_size) when
solution collection is enabled. Update the logic in batch_pdlp_memory_estimator
to conditionally add those solution-related memory contributions based on
generate_batch_primal_dual_solution (or the solver option used to control
solution generation) so the estimator reflects the true memory needs when
solution collection is disabled.
♻️ Duplicate comments (1)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)

994-997: ⚠️ Potential issue | 🟠 Major

Mirror the sub-MIP disable guard in reliability branching.

Root strong branching disables batch PDLP when settings.sub_mip is true (line 445), but this reliability branching path doesn't include the same guard. Nested sub-MIP reliability branching can still start the batch PDLP thread and reuse the warm cache, which the PR description indicates should be disabled.

Suggested fix
-  const bool use_pdlp = (rb_mode != 0) && pdlp_warm_cache.populated &&
+  const bool use_pdlp = !settings.sub_mip &&
+                        (rb_mode != 0) && pdlp_warm_cache.populated &&
                         unreliable_list.size() > min_num_candidates_for_pdlp &&
                         pdlp_warm_cache.pourcent_solved_by_batch_pdlp_at_root >
                           min_pourcent_solved_by_batch_pdlp_at_root_for_pdlp;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 994 - 997, The
reliability-branching condition that computes use_pdlp should mirror the root
strong-branching guard by disabling batch PDLP when settings.sub_mip is true;
update the boolean expression that defines use_pdlp (currently using rb_mode,
pdlp_warm_cache, unreliable_list, min_num_candidates_for_pdlp,
pdlp_warm_cache.pourcent_solved_by_batch_pdlp_at_root, and
min_pourcent_solved_by_batch_pdlp_at_root_for_pdlp) to include a
!settings.sub_mip check so nested sub-MIP reliability branching cannot start or
reuse the batch PDLP warm cache.
🧹 Nitpick comments (4)
cpp/src/branch_and_bound/pseudo_costs.cpp (4)

478-481: Track the commented-out TODO for dual solution uncrushing.

The uncrush_dual_solution call is commented out with a TODO referencing "Chris". This may affect warm-start quality if dual solution components (root_soln_y, root_soln_z) are needed for PDLP initialization.

Consider adding a tracking issue or ensuring this is addressed before the PR exits draft status.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 478 - 481, The call
to uncrush_dual_solution(...) is currently commented out, risking degraded
warm-starts because root_soln_y/root_soln_z (and
original_root_soln_y/original_root_soln_z) won't be restored; either re-enable
the call in the PDLP initialization path (using original_problem, original_lp,
root_soln_y, root_soln_z, original_root_soln_y, original_root_soln_z) or add a
tracked TODO/issue referencing this exact block so it must be resolved before
leaving draft status; ensure any enabling is guarded by a feature flag or a
clear comment linking the issue/owner (Chris) so reviewers can verify the
follow-up.

1167-1168: Consider using candidate_vars[i] consistently.

The taskloop uses unreliable_list[i] while the post-PDLP merge loop uses candidate_vars[i]. Since candidate_vars is defined as the first num_candidates elements of unreliable_list, they're functionally equivalent, but using candidate_vars[i] consistently would improve readability and reduce the risk of bugs if the logic changes.

-    for (i_t i = 0; i < num_candidates; ++i) {
-      const i_t j = unreliable_list[i];
+    for (i_t i = 0; i < num_candidates; ++i) {
+      const i_t j = candidate_vars[i];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1167 - 1168, The loop
currently iterates using unreliable_list[i] while later code uses
candidate_vars[i]; since candidate_vars is defined as the first num_candidates
of unreliable_list, change the loop to index candidate_vars[i] for consistency
and readability: replace uses of unreliable_list[i] inside the for-loop (and any
related variables derived there) with candidate_vars[i] so the same symbol is
used everywhere (e.g., in the taskloop and the post-PDLP merge loop) to avoid
future mismatches between unreliable_list and candidate_vars.

366-374: Consider epsilon comparison for sigma value checks.

The code uses exact floating-point equality (sigma == -1, sigma == 1) to determine constraint bound computation. While sigma values should be exactly ±1 from matrix construction, exact float comparisons can fail silently if numerical perturbations occur.

More critically, if sigma is neither exactly 1 nor exactly -1, the else branch triggers an assert in debug mode but leaves constraint_lower[i] and constraint_upper[i] uninitialized in release builds.

Suggested defensive fix
-    if (sigma == -1) {
+    if (sigma < 0) {  // sigma == -1
       constraint_lower[i] = slack_lower + lp.rhs[i];
       constraint_upper[i] = slack_upper + lp.rhs[i];
-    } else if (sigma == 1) {
+    } else {  // sigma == +1
       constraint_lower[i] = -slack_upper + lp.rhs[i];
       constraint_upper[i] = -slack_lower + lp.rhs[i];
-    } else {
-      assert(sigma == 1.0 || sigma == -1.0);
     }
+    assert(std::abs(std::abs(sigma) - 1.0) < 1e-9 && "sigma must be ±1");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 366 - 374, The code
currently compares sigma to exact values (-1 and 1) and leaves
constraint_lower[i] and constraint_upper[i] uninitialized if sigma is slightly
off, so change the branch to use an epsilon check (e.g., fabs(sigma - 1.0) < eps
and fabs(sigma + 1.0) < eps) to detect ±1; compute bounds using the existing
formulas when matched; otherwise set constraint_lower[i] and constraint_upper[i]
to a safe fallback (for example compute using sign = (sigma >= 0 ? 1 : -1) or
clamp sigma to its sign) and still trigger the assert/log for unexpected values
so release builds get initialized while debug builds still catch the anomaly;
reference symbols to update: sigma, constraint_lower, constraint_upper,
slack_lower, slack_upper, lp.rhs in pseudo_costs.cpp.

1272-1287: Clean up or track commented-out debug logging.

This block of commented-out logging code with "TODO put back" should either be removed, converted to a proper TODO with issue tracking, or re-enabled. Leaving large commented blocks reduces code readability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1272 - 1287, This
large commented-out logging block around rb_mode and log.printf (referencing
num_candidates, ds_optimal, ds_infeasible, ds_failed, ds_skipped, ds_elapsed)
should not remain in place as-is: either remove it entirely, or restore it by
uncommenting and fixing formatting so the if (rb_mode != 2) { ... } branch
compiles and uses the existing variables, or replace the block with a single
concise TODO comment that references a tracker/issue ID; pick one approach and
apply it consistently so the file no longer contains the large commented-out
printf block.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp`:
- Around line 287-292: The default for generate_batch_primal_dual_solution is
currently false which breaks existing batch LP callers that expect
primal/dual/reduced-cost buffers; change the default for the
generate_batch_primal_dual_solution member in solver_settings.hpp back to true
so batch solves continue to populate those buffers by default, and update the
inline comment to reflect that the default preserves previous behavior (callers
that want to opt-out can still set generate_batch_primal_dual_solution=false or
use the existing strong/reliability-branching call sites to disable collection).

In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 158-172: Fix the typo in the comment above the DS check: change
"iteartion" to "iteration" in the comment that precedes the conditional where
sb_view.is_valid() and the ds_is_valid_done checks (the block that calls
sb_view.mark_solved and logs via settings.log.printf), so the comment reads "We
could not mark as solved nodes hitting iteration limit in DS".

In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp`:
- Around line 37-43: Add runtime bounds validation before any access to the
shared coordination state: ensure that settings.shared_sb_view.solved.size() (or
the span size exposed by shared_strong_branching_context_view_t) is >=
max_batch_size (where max_batch_size = settings.new_bounds.size()) before
entering run_batch_pdlp()'s batch loop; also add guard checks in the methods
is_solved(i_t), mark_solved(i_t) and subview(offset,count) to validate their
index/offset+count against the underlying solved span size and handle/report
invalid accesses (e.g., return false/log/throw) to prevent out-of-bounds access
and corrupted coordination state.

In `@cpp/src/pdlp/solve.cu`:
- Around line 961-965: The code allows settings.sub_batch_size to
unconditionally overwrite optimal_batch_size and can exceed the memory-checked
ceiling (memory_max_batch_size), risking GPU overcommit; update the logic around
optimal_batch_size (used with detail::optimal_batch_size_handler and compared to
max_batch_size) to revalidate or clamp settings.sub_batch_size against the
computed ceiling (memory_max_batch_size or max_batch_size) before assigning, and
adjust the subsequent cuopt_assert to reflect the validated/clamped value so an
oversized manual override cannot bypass the memory check.

In `@cpp/tests/linear_programming/pdlp_test.cu`:
- Around line 2244-2255: The test uses a fixed sleep
(std::this_thread::sleep_for(... 200ms)) to wait for PDLP to start, which is
flaky; replace this timing-based wait with deterministic synchronization: add a
latch/condition (e.g., std::promise/std::future or std::condition_variable +
mutex) or a std::barrier that the PDLP thread signals when it has reached the
point where external marks should be applied, then have the main test wait on
that synchronization primitive before setting ctx.solved[i].store(1); update the
lambda that creates pdlp_thread (pdlp_thread, solve_lp, and the PDLP entry
point) to signal the latch at the proper startup point so
shared_sb_view_concurrent_mark and shared_sb_view_all_infeasible
deterministically exercise the ConcurrentLimit race path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1a11857e-6bdc-4a3c-a045-386b504556d7

📥 Commits

Reviewing files that changed from the base of the PR and between b0061e4 and f504a75.

📒 Files selected for processing (15)
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
  • cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/branch_and_bound/shared_strong_branching_context.hpp
  • cpp/src/dual_simplex/simplex_solver_settings.hpp
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/pdlp/pdlp_constants.hpp
  • cpp/src/pdlp/solve.cu
  • cpp/src/pdlp/solver_settings.cu
  • cpp/src/pdlp/termination_strategy/termination_strategy.cu
  • cpp/src/pdlp/utilities/ping_pong_graph.cu
  • cpp/tests/linear_programming/pdlp_test.cu
  • python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py
💤 Files with no reviewable changes (1)
  • cpp/src/pdlp/pdlp_constants.hpp
✅ Files skipped from review due to trivial changes (1)
  • cpp/src/pdlp/utilities/ping_pong_graph.cu
🚧 Files skipped from review as they are similar to previous changes (6)
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/src/dual_simplex/simplex_solver_settings.hpp
  • cpp/src/math_optimization/solver_settings.cu
  • python/cuopt_server/cuopt_server/utils/linear_programming/data_definition.py
  • cpp/src/pdlp/solver_settings.cu
  • cpp/src/pdlp/pdlp.cu

Comment on lines +287 to +292
// By default to save memory and speed we don't store and copy each climber's primal and dual
// solutions We only retrieve termination statistics and the objective values
bool generate_batch_primal_dual_solution{false};
// Used to force batch PDLP to solve a subbatch of the problems at a time
// The 0 default value will make the solver use its heuristic to determine the subbatch size
i_t sub_batch_size{0};

⚠️ Potential issue | 🔴 Critical

Keep batch solution generation opt-out, not opt-in.

With this default at false, generic batch LP solves now return empty primal/dual/reduced-cost buffers unless every caller remembers to opt in. Existing batch tests in cpp/tests/linear_programming/pdlp_test.cu still slice those buffers after solve_lp(..., new_bounds ...), so this silently regresses the non-MIP batch path and can drive helpers like extract_subvector() past a zero-length buffer. The strong/reliability-branching call sites can disable collection explicitly, but the default here needs to preserve the old behavior.

Suggested fix
-  bool generate_batch_primal_dual_solution{false};
+  bool generate_batch_primal_dual_solution{true};
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// By default to save memory and speed we don't store and copy each climber's primal and dual
// solutions We only retrieve termination statistics and the objective values
bool generate_batch_primal_dual_solution{false};
// Used to force batch PDLP to solve a subbatch of the problems at a time
// The 0 default value will make the solver use its heuristic to determine the subbatch size
i_t sub_batch_size{0};
// By default to save memory and speed we don't store and copy each climber's primal and dual
// solutions We only retrieve termination statistics and the objective values
bool generate_batch_primal_dual_solution{true};
// Used to force batch PDLP to solve a subbatch of the problems at a time
// The 0 default value will make the solver use its heuristic to determine the subbatch size
i_t sub_batch_size{0};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp` around lines
287 - 292, The default for generate_batch_primal_dual_solution is currently
false which breaks existing batch LP callers that expect
primal/dual/reduced-cost buffers; change the default for the
generate_batch_primal_dual_solution member in solver_settings.hpp back to true
so batch solves continue to populate those buffers by default, and update the
inline comment to reflect that the default preserves previous behavior (callers
that want to opt-out can still set generate_batch_primal_dual_solution=false or
use the existing strong/reliability-branching call sites to disable collection).

Comment on lines +37 to +43
bool is_solved(i_t local_idx) const { return solved[local_idx].load() != 0; }

void mark_solved(i_t local_idx) const { solved[local_idx].store(1); }

shared_strong_branching_context_view_t subview(i_t offset, i_t count) const
{
return {solved.subspan(offset, count)};

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the header file in question
find . -name "shared_strong_branching_context.hpp" -type f

Repository: NVIDIA/cuopt

Length of output: 118


🏁 Script executed:

# Examine the full context of the shared_strong_branching_context.hpp file
cat -n cpp/src/branch_and_bound/shared_strong_branching_context.hpp

Repository: NVIDIA/cuopt

Length of output: 1636


🏁 Script executed:

# Search for usages of subview, is_solved, and mark_solved in solve.cu
rg "subview|is_solved|mark_solved" cpp/src/pdlp/solve.cu -A 3 -B 3

Repository: NVIDIA/cuopt

Length of output: 347


🏁 Script executed:

# Check how the shared_strong_branching_context is initialized and used
rg "shared_strong_branching_context" cpp/src/pdlp/solve.cu -A 2 -B 2

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Get broader context around the subview call in solve.cu
rg "shared_sb_view.subview" cpp/src/pdlp/solve.cu -B 10 -A 5

Repository: NVIDIA/cuopt

Length of output: 744


🏁 Script executed:

# Look for the loop that iterates through batches and understand i and current_batch_size
rg "for.*i.*current_batch_size|while.*current_batch_size" cpp/src/pdlp/solve.cu -B 5 -A 10

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Search for any initialization or setup of shared_sb_view to understand its size
rg "shared_sb_view\s*=" cpp/src/pdlp/solve.cu -B 5 -A 3

Repository: NVIDIA/cuopt

Length of output: 484


🏁 Script executed:

# Check if there's any size validation before the batch loop
rg "solve.cu" -l | head -1 | xargs wc -l

Repository: NVIDIA/cuopt

Length of output: 102


🏁 Script executed:

# Search for where run_batch_pdlp is called and how shared_sb_view is initialized
rg "run_batch_pdlp|shared_strong_branching_context_t" cpp/src/pdlp/ -B 3 -A 3 | head -50

Repository: NVIDIA/cuopt

Length of output: 849


🏁 Script executed:

# Check how settings.shared_sb_view is created/passed in
rg "shared_sb_view\s*=" cpp/src/pdlp/ --type cpp --type cu -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 82


🏁 Script executed:

# Look for how num_subproblems and max_batch_size are related
rg "num_subproblems|max_batch_size" cpp/src/pdlp/solve.cu -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 2609


🏁 Script executed:

# Search for where shared_strong_branching_context_t or shared_sb_view is created/initialized
rg "shared_strong_branching_context_t|shared_sb_view" cpp/src/pdlp/ --type cpp -B 5 -A 5 | head -80

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Look at the full solve.cu function signature and understand the settings parameter
rg -A 30 "template.*run_batch_pdlp" cpp/src/pdlp/solve.cu | head -40

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Check if there are any asserts or validations about shared_sb_view size before the loop
rg "shared_sb_view" cpp/src/pdlp/solve.cu -B 20 | grep -A 20 "run_batch_pdlp"

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Let's search more broadly in the entire codebase for shared_strong_branching_context
find . -type f \( -name "*.cpp" -o -name "*.hpp" -o -name "*.cu" -o -name "*.cuh" \) -exec grep -l "shared_strong_branching_context" {} \;

Repository: NVIDIA/cuopt

Length of output: 270


🏁 Script executed:

# Check the test file to understand how this structure is used
rg "shared_strong_branching_context" cpp/tests/ -B 3 -A 3

Repository: NVIDIA/cuopt

Length of output: 5269


🏁 Script executed:

# Get the signature and first part of run_batch_pdlp to understand all parameters
sed -n '1,100p' cpp/src/pdlp/solve.cu | rg "run_batch_pdlp" -B 5 -A 25

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Look for any validation of shared_sb_view before it's used
rg "is_valid|shared_sb_view" cpp/src/pdlp/solve.cu -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 297


🏁 Script executed:

# Verify the exact loop bounds and how max_batch_size relates to context initialization
sed -n '1,200p' cpp/src/pdlp/solve.cu | grep -A 50 "for (size_t i = 0; i < max_batch_size"

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Check solver_settings.hpp to see how shared_sb_view is defined and documented
rg "shared_sb_view" cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp -B 3 -A 3

Repository: NVIDIA/cuopt

Length of output: 496


🏁 Script executed:

# Check if there's any documentation or assertion in run_batch_pdlp about shared_sb_view size requirement
sed -n '1,300p' cpp/src/pdlp/solve.cu | tail -100

Repository: NVIDIA/cuopt

Length of output: 7391


🏁 Script executed:

# Search for any size validation of shared_sb_view in the entire codebase
rg "shared_sb_view.*size|solved.*size" cpp/ -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 2476


🏁 Script executed:

# Look at how the batch_pdlp_solve function calls run_batch_pdlp to see if there's validation there
rg "batch_pdlp_solve|run_batch_pdlp" cpp/src/pdlp/solve.cu -B 5 -A 15 | head -60

Repository: NVIDIA/cuopt

Length of output: 2350


🏁 Script executed:

# Look for any assertions or documentation about the size relationship
rg "shared_sb_view|max_batch_size" cpp/src/pdlp/solve.cu | grep -i "assert\|check\|expect\|size"

Repository: NVIDIA/cuopt

Length of output: 1388


🏁 Script executed:

# Check pseudo_costs.cpp to see if there are any additional clues about expected usage
rg "shared_strong_branching_context" cpp/src/branch_and_bound/pseudo_costs.cpp -B 5 -A 5

Repository: NVIDIA/cuopt

Length of output: 2294


🏁 Script executed:

# Verify std::span behavior by checking if there's any documentation in the code
rg "span.subspan|span\[" cpp/src/branch_and_bound/shared_strong_branching_context.hpp -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 38


Add bounds validation before accessing the shared context view.

is_solved, mark_solved, and subview directly index/subspan the shared coordination state without runtime bounds checks. The is_valid() guard only detects empty spans, not size mismatches. When run_batch_pdlp() iterates with indices derived from max_batch_size = settings.new_bounds.size(), it must validate that settings.shared_sb_view.solved.size() >= max_batch_size before the batch loop begins. Otherwise, mismatched context sizes will cause out-of-bounds access and corrupt coordination state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp` around lines 37
- 43, Add runtime bounds validation before any access to the shared coordination
state: ensure that settings.shared_sb_view.solved.size() (or the span size
exposed by shared_strong_branching_context_view_t) is >= max_batch_size (where
max_batch_size = settings.new_bounds.size()) before entering run_batch_pdlp()'s
batch loop; also add guard checks in the methods is_solved(i_t),
mark_solved(i_t) and subview(offset,count) to validate their index/offset+count
against the underlying solved span size and handle/report invalid accesses
(e.g., return false/log/throw) to prevent out-of-bounds access and corrupted
coordination state.

Comment on lines +961 to 965
size_t optimal_batch_size = use_optimal_batch_size
? detail::optimal_batch_size_handler(problem, memory_max_batch_size)
: max_batch_size;
if (settings.sub_batch_size > 0) { optimal_batch_size = settings.sub_batch_size; }
cuopt_assert(optimal_batch_size != 0 && optimal_batch_size <= max_batch_size,

⚠️ Potential issue | 🟠 Major

Revalidate manual sub-batch overrides against the fitted ceiling.

settings.sub_batch_size overwrites the memory-checked value unconditionally. If the full batch does not fit and the caller forces a larger sub-batch than memory_max_batch_size, this path can still overcommit GPU memory and undo the protection above.

Suggested fix
   size_t optimal_batch_size = use_optimal_batch_size
                                 ? detail::optimal_batch_size_handler(problem, memory_max_batch_size)
                                 : max_batch_size;
-  if (settings.sub_batch_size > 0) { optimal_batch_size = settings.sub_batch_size; }
+  if (settings.sub_batch_size > 0) {
+    const auto requested_sub_batch_size = static_cast<size_t>(settings.sub_batch_size);
+    cuopt_assert(requested_sub_batch_size <= memory_max_batch_size,
+                 "sub_batch_size exceeds the memory-checked batch size");
+    optimal_batch_size = requested_sub_batch_size;
+  }
As per coding guidelines, "Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems".
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
size_t optimal_batch_size = use_optimal_batch_size
? detail::optimal_batch_size_handler(problem, memory_max_batch_size)
: max_batch_size;
if (settings.sub_batch_size > 0) { optimal_batch_size = settings.sub_batch_size; }
cuopt_assert(optimal_batch_size != 0 && optimal_batch_size <= max_batch_size,
size_t optimal_batch_size = use_optimal_batch_size
? detail::optimal_batch_size_handler(problem, memory_max_batch_size)
: max_batch_size;
if (settings.sub_batch_size > 0) {
const auto requested_sub_batch_size = static_cast<size_t>(settings.sub_batch_size);
cuopt_assert(requested_sub_batch_size <= memory_max_batch_size,
"sub_batch_size exceeds the memory-checked batch size");
optimal_batch_size = requested_sub_batch_size;
}
cuopt_assert(optimal_batch_size != 0 && optimal_batch_size <= max_batch_size,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/solve.cu` around lines 961 - 965, The code allows
settings.sub_batch_size to unconditionally overwrite optimal_batch_size and can
exceed the memory-checked ceiling (memory_max_batch_size), risking GPU
overcommit; update the logic around optimal_batch_size (used with
detail::optimal_batch_size_handler and compared to max_batch_size) to revalidate
or clamp settings.sub_batch_size against the computed ceiling
(memory_max_batch_size or max_batch_size) before assigning, and adjust the
subsequent cuopt_assert to reflect the validated/clamped value so an oversized
manual override cannot bypass the memory check.

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
cpp/include/cuopt/linear_programming/mip/solver_settings.hpp (1)

101-104: Consider replacing 0/1/2 mode literals with shared named constants (or enum).

Using a shared enum/constexpr for these mode values would reduce semantic drift between comments, registration bounds, and consumers.

♻️ Example refactor (local to this header)
+  enum class batch_pdlp_mode_t : i_t {
+    dual_simplex_only = 0,
+    cooperative       = 1,
+    batch_pdlp_only   = 2
+  };
+
-  i_t mip_batch_pdlp_strong_branching{
-    1};  // 0 = DS only, 1 = cooperative DS + PDLP, 2 = batch PDLP only
-  i_t mip_batch_pdlp_reliability_branching{
-    1};  // 0 = DS only, 1 = cooperative DS + PDLP, 2 = batch PDLP only
+  i_t mip_batch_pdlp_strong_branching{
+    static_cast<i_t>(batch_pdlp_mode_t::cooperative)};
+  i_t mip_batch_pdlp_reliability_branching{
+    static_cast<i_t>(batch_pdlp_mode_t::cooperative)};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/mip/solver_settings.hpp` around lines
101 - 104, Replace the numeric mode literals for mip_batch_pdlp_strong_branching
and mip_batch_pdlp_reliability_branching with a shared named enum or constexpr
constants (e.g., Mode::DS_ONLY, Mode::COOPERATIVE_DS_PDLP,
Mode::BATCH_PDLP_ONLY) declared in this header; update the variable
types/initializers to use those named constants, update the comments to refer to
the enum names, and ensure any registration/validation logic that currently
checks 0/1/2 uses the enum values so bounds and consumers remain consistent with
the new symbols.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cpp/include/cuopt/linear_programming/mip/solver_settings.hpp`:
- Around line 101-104: Replace the numeric mode literals for
mip_batch_pdlp_strong_branching and mip_batch_pdlp_reliability_branching with a
shared named enum or constexpr constants (e.g., Mode::DS_ONLY,
Mode::COOPERATIVE_DS_PDLP, Mode::BATCH_PDLP_ONLY) declared in this header;
update the variable types/initializers to use those named constants, update the
comments to refer to the enum names, and ensure any registration/validation
logic that currently checks 0/1/2 uses the enum values so bounds and consumers
remain consistent with the new symbols.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ea0863f1-03fa-448e-a445-e8836db6e514

📥 Commits

Reviewing files that changed from the base of the PR and between f504a75 and f456c71.

📒 Files selected for processing (5)
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
  • cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/src/pdlp/solve.cu
✅ Files skipped from review due to trivial changes (1)
  • cpp/src/pdlp/solve.cu
🚧 Files skipped from review as they are similar to previous changes (3)
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp

@Kh4ster
Contributor Author

Kh4ster commented Mar 30, 2026

/ok to test 496c4fd

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)

158-172: ⚠️ Potential issue | 🟡 Minor

Minor: Typo in comment.

Line 160 has "iteartion" instead of "iteration".

-      // We could not mark as solved nodes hitting iteartion limit in DS
+      // We could not mark as solved nodes hitting iteration limit in DS
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 158 - 172, Fix the
simple typo in the comment inside the block guarded by sb_view.is_valid() in
pseudo_costs.cpp: change "iteartion" to "iteration" in the comment that reads
"We could not mark as solved nodes hitting iteartion limit in DS"; the
surrounding code uses sb_view.mark_solved(shared_idx) and
ds_is_valid_done(ds_status_down[k]) / ds_is_valid_done(ds_status_up[k]) so
update only the comment text to "iteration" to correct the spelling.
🧹 Nitpick comments (3)
cpp/src/branch_and_bound/pseudo_costs.cpp (3)

502-505: Redundant assert before identical if-check.

The assert on line 502 checks the same condition as the if-check on line 505. Since the code inside the if-block is the only code path anyway, the assert is redundant.

-    assert(!pc.pdlp_warm_cache.populated &&
-           "PDLP warm cache should not be populated at this point");
-
-    if (!pc.pdlp_warm_cache.populated) {
+    if (!pc.pdlp_warm_cache.populated) {
+      // Expected: cache is not populated on first entry
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 502 - 505, Remove the
redundant assert that checks pc.pdlp_warm_cache.populated immediately before the
identical if-check; specifically delete the assert(!pc.pdlp_warm_cache.populated
&& "PDLP warm cache should not be populated at this point") so only the existing
if (!pc.pdlp_warm_cache.populated) { ... } remains (leave the if-block and its
logic unchanged).

1283-1301: Address TODO: Remove or restore commented-out logging code.

This block of commented-out logging code should either be restored or removed before merging. If the statistics are useful, consider re-enabling them; otherwise, clean up the dead code.

Would you like me to help restore this logging block with the current variable names, or should it be removed entirely?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1283 - 1301, Restore
the commented-out RB Dual Simplex logging: re-enable the conditional block
around rb_mode (use rb_mode != 2 check, with the inner rb_mode == 1 branch) and
call log.printf with the current variables ds_elapsed (from toc(ds_start_time)),
num_candidates, ds_optimal.load(), ds_infeasible.load(), ds_failed.load(), and
ds_skipped.load() exactly as in the diff; ensure the two format strings match
their branches (one including ds_skipped and “(PDLP)”, the other without it) and
that values use num_candidates * 2 for the denominators as shown, keeping
correct punctuation and line breaks so the code compiles.

366-373: Consider using epsilon comparison for floating-point sigma values.

Exact floating-point comparison (sigma == -1, sigma == 1) can fail if sigma has any numerical noise from matrix operations. While slack coefficients are typically constructed as exact +1/-1, an epsilon-based comparison would be more robust.

-    if (sigma == -1) {
+    constexpr f_t eps = 1e-9;
+    if (std::abs(sigma - (-1.0)) < eps) {
       constraint_lower[i] = slack_lower + lp.rhs[i];
       constraint_upper[i] = slack_upper + lp.rhs[i];
-    } else if (sigma == 1) {
+    } else if (std::abs(sigma - 1.0) < eps) {
       constraint_lower[i] = -slack_upper + lp.rhs[i];
       constraint_upper[i] = -slack_lower + lp.rhs[i];
     } else {
-      assert(sigma == 1.0 || sigma == -1.0);
+      assert(false && "sigma must be +1 or -1");
     }

As per coding guidelines: "use epsilon comparisons for floating-point equality checks"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 366 - 373, The code
compares floating-point sigma to exact integers; replace direct comparisons in
the block handling sigma with epsilon-based checks (e.g., fabs(sigma - 1.0) <
EPS and fabs(sigma + 1.0) < EPS) when assigning constraint_lower[i] and
constraint_upper[i] using slack_lower/slack_upper and lp.rhs[i]; define or reuse
a small EPS constant and update the final assert to assert(fabs(sigma - 1.0) <
EPS || fabs(sigma + 1.0) < EPS) so the logic in this function (pseudo_costs.cpp
handling sigma, constraint_lower, constraint_upper, slack_lower, slack_upper) is
robust to floating-point noise.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 158-172: Fix the simple typo in the comment inside the block
guarded by sb_view.is_valid() in pseudo_costs.cpp: change "iteartion" to
"iteration" in the comment that reads "We could not mark as solved nodes hitting
iteartion limit in DS"; the surrounding code uses
sb_view.mark_solved(shared_idx) and ds_is_valid_done(ds_status_down[k]) /
ds_is_valid_done(ds_status_up[k]) so update only the comment text to "iteration"
to correct the spelling.

---

Nitpick comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 502-505: Remove the redundant assert that checks
pc.pdlp_warm_cache.populated immediately before the identical if-check;
specifically delete the assert(!pc.pdlp_warm_cache.populated && "PDLP warm cache
should not be populated at this point") so only the existing if
(!pc.pdlp_warm_cache.populated) { ... } remains (leave the if-block and its
logic unchanged).
- Around line 1283-1301: Restore the commented-out RB Dual Simplex logging:
re-enable the conditional block around rb_mode (use rb_mode != 2 check, with the
inner rb_mode == 1 branch) and call log.printf with the current variables
ds_elapsed (from toc(ds_start_time)), num_candidates, ds_optimal.load(),
ds_infeasible.load(), ds_failed.load(), and ds_skipped.load() exactly as in the
diff; ensure the two format strings match their branches (one including
ds_skipped and “(PDLP)”, the other without it) and that values use
num_candidates * 2 for the denominators as shown, keeping correct punctuation
and line breaks so the code compiles.
- Around line 366-373: The code compares floating-point sigma to exact integers;
replace direct comparisons in the block handling sigma with epsilon-based checks
(e.g., fabs(sigma - 1.0) < EPS and fabs(sigma + 1.0) < EPS) when assigning
constraint_lower[i] and constraint_upper[i] using slack_lower/slack_upper and
lp.rhs[i]; define or reuse a small EPS constant and update the final assert to
assert(fabs(sigma - 1.0) < EPS || fabs(sigma + 1.0) < EPS) so the logic in this
function (pseudo_costs.cpp handling sigma, constraint_lower, constraint_upper,
slack_lower, slack_upper) is robust to floating-point noise.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c9d7be61-6bc3-4400-aa36-b21d1943fd39

📥 Commits

Reviewing files that changed from the base of the PR and between f456c71 and 962c2ea.

📒 Files selected for processing (1)
  • cpp/src/branch_and_bound/pseudo_costs.cpp

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp`:
- Around line 49-53: The bounds check in subview can overflow when adding signed
i_t values; replace the single assert(offset + count <= solved.size()) with a
safe check that prevents overflow by validating non-negativity (if i_t is
signed) and comparing sizes using the same unsigned type as solved.size():
ensure offset >= 0 and count >= 0 (if applicable), then cast to the span-size
type (e.g. std::size_t) and assert(static_cast<size_type>(offset) <=
solved.size() && static_cast<size_type>(count) <= solved.size() -
static_cast<size_type>(offset)); also use those casts when calling
solved.subspan(offset, count) so subview uses consistent unsigned indices.
- Around line 37-47: The bounds checks in is_solved and mark_solved are unsafe
if i_t is signed because a negative local_idx will convert to a large unsigned
when compared to solved.size() and assert may be disabled in release; update
both methods (is_solved and mark_solved) to first check that local_idx is
non-negative and then compare static_cast<size_t>(local_idx) against
solved.size(), and handle failures with a runtime-safe response (e.g., throw
std::out_of_range or return/abort) instead of relying solely on assert so access
to solved[local_idx] is always safe.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4f7a71dc-8774-4f65-90e4-9dd5c98b0e0a

📥 Commits

Reviewing files that changed from the base of the PR and between 962c2ea and 496c4fd.

📒 Files selected for processing (1)
  • cpp/src/branch_and_bound/shared_strong_branching_context.hpp

Comment on lines +37 to +47
bool is_solved(i_t local_idx) const
{
assert(local_idx < solved.size() && "local_idx out of bounds");
return solved[local_idx].load() != 0;
}

void mark_solved(i_t local_idx) const
{
assert(local_idx < solved.size() && "local_idx out of bounds");
solved[local_idx].store(1);
}

⚠️ Potential issue | 🟠 Major

Signed/unsigned comparison hazard if i_t is signed.

If i_t is a signed type (e.g., int), a negative local_idx value will implicitly convert to a large unsigned value when compared against solved.size(), bypassing the bounds check and causing undefined behavior on access. Additionally, assert is disabled in release builds, leaving no runtime protection.

Consider adding an explicit non-negative check or casting to ensure safe comparison:

🛡️ Proposed fix
   bool is_solved(i_t local_idx) const
   {
-    assert(local_idx < solved.size() && "local_idx out of bounds");
+    assert(local_idx >= 0 && static_cast<size_t>(local_idx) < solved.size() && "local_idx out of bounds");
     return solved[local_idx].load() != 0;
   }

   void mark_solved(i_t local_idx) const
   {
-    assert(local_idx < solved.size() && "local_idx out of bounds");
+    assert(local_idx >= 0 && static_cast<size_t>(local_idx) < solved.size() && "local_idx out of bounds");
     solved[local_idx].store(1);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/shared_strong_branching_context.hpp` around lines 37
- 47, The bounds checks in is_solved and mark_solved are unsafe if i_t is signed
because a negative local_idx will convert to a large unsigned when compared to
solved.size() and assert may be disabled in release; update both methods
(is_solved and mark_solved) to first check that local_idx is non-negative and
then compare static_cast<size_t>(local_idx) against solved.size(), and handle
failures with a runtime-safe response (e.g., throw std::out_of_range or
return/abort) instead of relying solely on assert so access to solved[local_idx]
is always safe.

Contributor

@nguidotti nguidotti left a comment

Thanks for the hard work, Nicolas!

My main concerns are:

  1. Do not create a std::thread at each call to the reliability branching. Instead, use omp task to integrate with the rest of B&B, share the CPU resources more efficiently and avoid oversubscribing the CPU.

  2. Since two or more threads might call reliability branching at the same time, some variables may appear in multiple unreliable lists. Yet a single call to strong branching can be enough to make the variable reliable. If this is the case, then we can propagate the score from one thread to another rather than calling strong branching again. That is why there is a mutex per variable that locks when performing a trial branching. When using batch PDLP, we also need to consider this case and mark the variable accordingly. Multiple calls to strong branching are not necessarily a bad thing (more information about a variable can lead to better decisions), but they may be too expensive.

  3. Considering that there are many simultaneous calls to batch PDLP, are the kernels inserted into the same CUDA stream, or does each call go to a separate stream? The latter may be better for performance, but might interfere with determinism.
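Concern 2 above could be sketched roughly like this; all names here (`var_trial_guard_t`, `branch_or_reuse`) are hypothetical stand-ins, not the actual cuOpt types. The first thread to take the per-variable lock runs the trial branching and publishes its score; later callers reuse the propagated score instead of branching again:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical per-variable trial-branching guard with score propagation.
struct var_trial_guard_t {
  std::mutex m;
  std::atomic<bool> reliable{false};
  std::atomic<double> score{0.0};
};

template <typename F>
double branch_or_reuse(var_trial_guard_t& g, F&& strong_branch)
{
  if (g.reliable.load(std::memory_order_acquire)) {
    return g.score.load(std::memory_order_relaxed);  // score already propagated
  }
  std::unique_lock<std::mutex> lk(g.m);
  if (g.reliable.load(std::memory_order_acquire)) {
    return g.score.load(std::memory_order_relaxed);  // solved while we waited
  }
  const double s = strong_branch();  // expensive trial branching runs once
  g.score.store(s, std::memory_order_relaxed);
  g.reliable.store(true, std::memory_order_release);
  return s;
}
```

The double-checked load under the lock is what turns a concurrent duplicate request into a cheap score read rather than a second strong-branching call.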


namespace {

static bool ds_is_valid_done(dual::status_t status)
Contributor

Nitpick: ds_is_valid_done --> is_dual_simplex_done

Contributor Author

Thanks I will update that

template <typename i_t, typename f_t>
void strong_branching(const user_problem_t<i_t, f_t>& original_problem,
const lp_problem_t<i_t, f_t>& original_lp,
static std::pair<f_t, i_t> merge_sb_result(f_t ds_val,
Contributor

Nitpick: use an enum instead of a int to set the source. Avoid setting an incorrect value in the future and the intent of the return is more clear
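A minimal sketch of that suggestion, with hypothetical names (`sb_result_source_t`, `sb_merged_t`) and an illustrative keep-the-tighter-bound merge policy, not the actual merge logic:

```cpp
#include <cassert>

// Hypothetical: replace the raw int "source" in merge_sb_result's return
// with an enum, so callers cannot pass or receive an invalid value.
enum class sb_result_source_t { dual_simplex, batch_pdlp };

struct sb_merged_t {
  double objective;
  sb_result_source_t source;
};

// Illustrative merge policy: keep the tighter (larger) relaxation bound.
sb_merged_t merge_sb_result(double ds_val, double pdlp_val)
{
  if (ds_val >= pdlp_val) { return {ds_val, sb_result_source_t::dual_simplex}; }
  return {pdlp_val, sb_result_source_t::batch_pdlp};
}
```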

Contributor Author

Agreed

const std::vector<i_t>& new_slacks,
const std::vector<variable_type_t>& var_types,
const std::vector<f_t> root_soln,
const std::vector<f_t>& root_soln,
Contributor

Note that in #979, I am setting the root_soln to lp_solution such that x, y and z are already available.

Contributor Author

Yes if your PR gets merged first I will use it instead

const std::vector<f_t>& edge_norms,
pseudo_costs_t<i_t, f_t>& pc)
pseudo_costs_t<i_t, f_t>& pc,
std::vector<f_t>& ds_obj_down,
Contributor

I would avoid abbreviating dual simplex as ds. We do not use this terminology in other parts of the code base.

Contributor

Similarly, strong_branching vs sb or reliability branching vs rb.

fractional.size());

// Cooperative DS + PDLP: shared context tracks which subproblems are solved
shared_strong_branching_context_t<i_t, f_t> shared_ctx(2 * fractional.size());
Contributor

I have a question: why do we need 2 structs here? Can you just pass a pointer or reference to shared_strong_branching_context_t instead?

Contributor Author

The original idea for the view version was to support sub-views: in BPDLP I might solve parts of the batch one after another, and the sub-view would abstract away the initial offset.

} else if (rb_mode != 0 && unreliable_list.size() < min_num_candidates_for_pdlp) {
log.printf("Not enough candidates to use batch PDLP, using DS only\n");
} else if (rb_mode != 0 && pdlp_warm_cache.pourcent_solved_by_batch_pdlp_at_root < 5.0) {
log.printf("Pourcent solved by batch PDLP at root is too low, using DS only\n");
Contributor

Typo: Pourcent --> Percent

Contributor

You need to check for other places as well

Contributor Author

Good catch

const int num_tasks = std::max(max_num_tasks, 1);
const int task_priority = reliability_branching_settings.task_priority;
const i_t max_num_candidates = reliability_branching_settings.max_num_candidates;
const i_t rb_mode = settings.mip_batch_pdlp_reliability_branching;
Contributor

I would suggest moving reliability branching with batch PDLP into a separate routine, and keeping the original one with just dual simplex.

Contributor

Also, it is important to consider that we are setting the reliability threshold based on the number of dual simplex iterations in reliability branching and B&B. So we may need a way to add the PDLP iterations to the calculation as well
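One hypothetical way to fold PDLP work into that same budget is to express PDLP iterations in dual-simplex-equivalent units via a cost ratio. The function name and the ratio below are purely illustrative; the ratio would have to be measured or tuned:

```cpp
#include <cassert>

// Hypothetical: convert batch PDLP iterations into "dual simplex equivalent"
// iterations so the existing reliability threshold keeps a single unit.
long long equivalent_simplex_iters(long long simplex_iters,
                                   long long pdlp_iters,
                                   double pdlp_cost_ratio /* e.g. 0.01 */)
{
  return simplex_iters +
         static_cast<long long>(pdlp_iters * pdlp_cost_ratio);
}
```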

std::vector<f_t> pdlp_obj_up(num_candidates, std::numeric_limits<f_t>::quiet_NaN());

std::atomic<int> concurrent_halt{0};
std::thread pdlp_thread;
Contributor

I would avoid creating a std::thread here for three reasons:

  1. The entire B&B is using OpenMP, so it is best to use the same model throughout the entire solver.
  2. This code section is executed in parallel by a large number of threads, so spawning one more per call essentially doubles the number of executing threads. This is very bad for performance.
  3. This routine is called very frequently, so creating and destroying a thread may lead to significant overhead.

Contributor

In other words, replace std::thread with omp task
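A minimal sketch of that task-based alternative, with hypothetical stand-ins for the two solver loops; the `concurrent_halt` flag mirrors the one already used in this PR:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical: race the batch PDLP side as an OpenMP task against the dual
// simplex loop run by the encountering thread, instead of spawning a
// std::thread per reliability-branching call. With OpenMP enabled this reuses
// the existing B&B thread pool; built without -fopenmp the pragmas are ignored
// and the code runs serially, which keeps the sketch portable.
int race_with_tasks()
{
  std::atomic<int> concurrent_halt{0};
  int solved = 0;
  #pragma omp parallel
  #pragma omp single
  {
    #pragma omp task shared(concurrent_halt, solved)
    {
      // stand-in for the batch PDLP solve; only the race winner records a result
      if (concurrent_halt.exchange(1) == 0) { solved += 1; }
    }
    // stand-in for the dual simplex loop on the encountering thread
    if (concurrent_halt.exchange(1) == 0) { solved += 1; }
    #pragma omp taskwait
  }
  return solved;  // exactly one side wins the race
}
```

The `taskwait` joins the PDLP task before the routine returns, so no thread outlives the call, and the atomic `exchange` guarantees exactly one side claims the win.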

@Kh4ster
Contributor Author

Kh4ster commented Mar 30, 2026

/ok to test 6e5baa5


Labels

improvement Improves an existing functionality mip non-breaking Introduces a non-breaking change pdlp

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants