Dump presolved problem to file #459
Conversation
Do you want to write both the original problem and the presolved problem out? Maybe you could append '.pre' to the filename for the presolved problem?
🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue.
Moving this out of the 25.12 release. If you want to keep this @hlinsen, feel free to move it back.
@hlinsen is this going in for 26.02?
This would be a good feature, but I'd prefer a solver setting / CLI argument linked to this (e.g. --presolve-file) that is false/null by default. We can let the user specify the filename with this argument, similar to --log-file or --solution-file. On a related note, I see some CLI arguments listed with a brief description but not the rest. Can we add brief descriptions to the ones that are not too obvious?
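The off-by-default behavior proposed here could look roughly like this (the field `presolved_problem_file` and helper `should_dump_presolved` are illustrative assumptions, not cuOpt's actual settings API):

```cpp
#include <cassert>
#include <string>

// Hypothetical settings shape: the presolve dump is opt-in, enabled only
// when the user supplies a filename (mirroring --log-file / --solution-file).
struct solver_settings_t {
  std::string user_problem_file;       // existing: dump of the original model
  std::string presolved_problem_file;  // proposed: empty means "disabled"
};

bool should_dump_presolved(const solver_settings_t& s)
{
  return !s.presolved_problem_file.empty();
}
```

Using an empty string as the "null" default keeps the setting out of the hot path entirely unless the user asked for the dump.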
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration — Configuration used: Path: .coderabbit.yaml, Review profile: CHILL, Plan: Pro
📒 Files selected for processing (6)
✅ Files skipped from review due to trivial changes (1)
📝 Walkthrough

This pull request introduces support for exporting presolved linear programming models to MPS files.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (2 warnings)
Actionable comments posted: 12
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/trigger-breaking-change-alert.yaml:
- Line 18: Replace the mutable ref "@main" used in the reusable workflow
declaration "uses:
rapidsai/shared-workflows/.github/workflows/breaking-change-alert.yaml@main"
with an immutable commit SHA (or a tightly controlled tag); update the uses line
to reference the specific commit SHA for the rapidsai/shared-workflows repo so
the pulled workflow cannot change unexpectedly when running in the
pull_request_target + secrets: inherit context.
In `@cpp/src/linear_programming/solve.cu`:
- Around line 888-902: presolver->undo(...) reconstructs full-problem duals into
dual_solution and reduced_costs, but the code unconditionally overwrites them
with NaN; change this so the NaN-fill only runs when dual postsolve is disabled.
Concretely, wrap the two thrust::fill calls (the ones that write
std::numeric_limits<f_t>::signaling_NaN() to dual_solution and reduced_costs) in
a conditional that checks the dual_postsolve setting (e.g., if
(!settings.dual_postsolve) { ... }), leaving presolver->undo and the
reconstructed duals intact when settings.dual_postsolve is true.
- Around line 402-424: The barrier solver is being given the full
settings.time_limit instead of the remaining caller budget, so update
run_barrier() to set barrier_settings.time_limit from the provided timer (use
timer.remaining_time()) rather than settings.time_limit; specifically change the
assignment to barrier_settings.time_limit so the call to
dual_simplex::solve_linear_program_with_barrier uses the remaining time budget
(mirror how the dual simplex branch uses timer.remaining_time()).
- Around line 681-699: The code starts dual_simplex_thread passing
dual_simplex_problem by reference (dual_simplex_thread/run_dual_simplex_thread)
and only then copies it to barrier_problem, which races because
run_dual_simplex_thread mutates its user_problem_t&; fix by making the barrier
copy (or both copies) before launching any threads so each thread gets its own
independent user_problem_t (e.g., create barrier_problem =
cuopt_problem_to_simplex_problem<i_t,f_t>(d_barrier_problem) or copy
dual_simplex_problem into barrier_problem) and/or change thread entry to take
the problem by value rather than std::ref to dual_simplex_problem, ensuring
dual_simplex_problem, barrier_problem and the threads do not share/mutate the
same object concurrently.
- Around line 535-538: The PDLP branch incorrectly treats a zero-constraint
(bound-only) LP as a numerical error; instead, detect when problem.n_constraints
== 0 and handle it by solving the bound-only case or delegating to the
dual-simplex solver: call the existing dual simplex routine (or a lightweight
bound-only evaluator) to compute the correct termination status (e.g., OPTIMAL
or UNBOUNDED) and objective/variable values, then return an
optimization_problem_solution_t<i_t,f_t> with that termination status and
problem.handle_ptr->get_stream() rather than returning
pdlp_termination_status_t::NumericalError. Ensure any existing dual-simplex
entrypoint (or a helper that evaluates bounds) is used so behavior matches other
paths.
- Around line 317-318: The global volatile flag global_concurrent_halt is
unsafe: replace the process-wide volatile int with a per-solve atomic halt flag
(e.g., std::atomic<int> halt) stored inside the solve context/struct passed to
solve_lp() and to all worker components (PDLP, barrier, dual simplex, crossover)
so each concurrent solve uses its own flag; update all places currently
reading/writing global_concurrent_halt to read/write the context->halt (or
reference) atomically, remove the volatile global, and ensure worker threads
capture the per-solve context rather than a global pointer to avoid data races.
In `@cpp/src/mip/solve.cu`:
- Around line 123-129: The loop is mutating caller-owned warm-start buffers via
scaling.scale_primal(*initial_solution); avoid in-place scaling by making a
solver-local copy of each initial solution before calling scaling.scale_primal.
Specifically, when iterating settings.initial_solutions in the branch guarded by
settings.mip_scaling, copy the pointed-to solution (e.g., create a local
variable from *initial_solution), call scaling.scale_primal on that copy, and
use the copied/scaled object for the solver, leaving settings.initial_solutions
and the caller's buffers unchanged so repeated solves won't double-scale.
- Around line 266-278: The code recomputes feasibility on the full problem via
full_sol.compute_feasibility() and warns if infeasible, but then constructs the
returned solution using sol = full_sol.get_solution(true, full_stats) which
forces a "feasible" flag and uses stats not computed from the full solution;
instead, obtain the correct stats and feasibility from full_sol (e.g.,
full_stats = full_sol.get_stats(); bool feasible = full_sol.get_feasible()),
remove the hacky post_process_completed override, and call
full_sol.get_solution(feasible, full_stats) (or otherwise pass the actual full
problem stats) so the returned sol reflects the true postsolve feasibility and
correct stats.
- Around line 203-205: The presolve_time_limit computation allows sub-second
budgets for small time_limit values and must enforce the same 1.0s floor used in
the LP path; update the code that computes presolve_time_limit (the variable in
the MIP solve code where you currently do const double presolve_time_limit =
std::min(0.1 * time_limit, 60.0);) to clamp the result to at least 1.0 second
before calling presolver->apply(), i.e. compute the 10%/60s cap as now and then
take std::max(..., 1.0) so presolver->apply() never receives a budget < 1.0s.
- Around line 81-88: The current device lambda that assigns solution.assignment
in the empty-problem fast path can pick get_upper(bounds) for zero objective
coefficients, producing +inf and later NaNs; change the selection logic inside
the lambda (the one capturing sol = solution.assignment.data() and pb =
problem.view()) so that when pb.objective_coefficients[index] == 0 it chooses a
finite feasible value: if 0 lies within [get_lower(bounds), get_upper(bounds)]
assign 0, otherwise clamp 0 to the nearest finite bound (prefer the finite lower
or upper; if one side is infinite pick the finite side; if both are infinite
assign 0). Ensure comparisons handle infinities robustly (use finite checks) so
solution.assignment never receives inf.
In `@docs/cuopt/source/faq.rst`:
- Line 286: The external notebook link in the FAQ line referencing
`notebooks/routing/service/cost_matrix_creation.ipynb` is incorrect; replace
that URL with the correct notebook path
`notebooks/intra-factory_transport/cost_matrix_and_waypoint_graph_creation.ipynb`
in the sentence that links to the cuOpt-Resources repo (the line containing the
hyperlink text "here"); also ensure the link continues to point to the `main`
branch as requested by updating the href to
https://github.com/NVIDIA/cuOpt-Resources/blob/main/notebooks/intra-factory_transport/cost_matrix_and_waypoint_graph_creation.ipynb
so the doc points to the existing resource.
In `@README.md`:
- Line 5: This PR mixes a release rollover into unrelated changes: revert or
remove the version bump strings "26.06" (and any install/packaging/CI changes
that reference 26.06) from this branch so the release/26.04 branch continues to
advertise 26.04; instead split the bump into a separate PR that updates all
README badges and install/packaging/CI entries (the repeated occurrences you
changed at the README badge and the other hunks referenced) or retarget this PR
to the correct release branch (PR 459 -> release/26.06) if the rollover belongs
here. Ensure you update or revert every occurrence of "26.06" you added (badges,
install commands, packaging/CI entries) so only the intended commit set remains
in this PR.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b2df145-ec1d-4dfa-90fb-7cc6117ccbbc
📒 Files selected for processing (45)

- .claude-plugin/marketplace.json
- .cursor-plugin/plugin.json
- .github/workflows/build.yaml
- .github/workflows/build_test_publish_images.yaml
- .github/workflows/pr.yaml
- .github/workflows/test.yaml
- .github/workflows/trigger-breaking-change-alert.yaml
- RAPIDS_BRANCH
- README.md
- VERSION
- conda/environments/all_cuda-129_arch-aarch64.yaml
- conda/environments/all_cuda-129_arch-x86_64.yaml
- conda/environments/all_cuda-131_arch-aarch64.yaml
- conda/environments/all_cuda-131_arch-x86_64.yaml
- cpp/src/linear_programming/solve.cu
- cpp/src/mip/solve.cu
- dependencies.yaml
- docs/cuopt/source/cuopt-python/routing/routing-example.ipynb
- docs/cuopt/source/faq.rst
- gemini-extension.json
- helmchart/cuopt-server/Chart.yaml
- helmchart/cuopt-server/values.yaml
- python/cuopt/pyproject.toml
- python/cuopt_self_hosted/pyproject.toml
- python/cuopt_server/pyproject.toml
- python/libcuopt/pyproject.toml
- skills/cuopt-developer/SKILL.md
- skills/cuopt-installation-api-c/SKILL.md
- skills/cuopt-installation-api-python/SKILL.md
- skills/cuopt-installation-common/SKILL.md
- skills/cuopt-installation-developer/SKILL.md
- skills/cuopt-lp-milp-api-c/SKILL.md
- skills/cuopt-lp-milp-api-cli/SKILL.md
- skills/cuopt-lp-milp-api-python/SKILL.md
- skills/cuopt-qp-api-c/SKILL.md
- skills/cuopt-qp-api-cli/SKILL.md
- skills/cuopt-qp-api-python/SKILL.md
- skills/cuopt-routing-api-python/SKILL.md
- skills/cuopt-server-api-python/SKILL.md
- skills/cuopt-server-common/SKILL.md
- skills/cuopt-user-rules/SKILL.md
- skills/lp-milp-formulation/SKILL.md
- skills/qp-formulation/SKILL.md
- skills/routing-formulation/SKILL.md
- skills/skill-evolution/SKILL.md
Actionable comments posted: 3
♻️ Duplicate comments (10)
README.md (1)

5-5: ⚠️ Potential issue | 🟠 Major — Please remove the 26.06 rollover from this 26.04-targeted PR.

This PR targets `release/26.04`, but Line 5 (and the install pins at Lines 86, 94, 102, 110, 121) now advertise `26.06.*`. That mixes release-rollover changes into an unrelated feature PR and can misdirect users on the 26.04 branch. Please either revert these `26.06` updates from this PR or retarget/split the PR accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` at line 5, Remove the accidental 26.06 rollover from this release/26.04 PR by reverting any occurrences of "26.06" back to the intended "26.04" in the README: replace the version badge string "[]" and the install pin strings that contain "26.06" with the corresponding "26.04" values, or else undo those README edits entirely so the PR contains only the feature changes targeting release/26.04; ensure no other "26.06" tokens remain before merging.

cpp/src/linear_programming/solve.cu (5)
888-902: ⚠️ Potential issue | 🟠 Major — Preserve postsolved duals when `dual_postsolve` is enabled.

`presolver->undo()` has just reconstructed full-problem duals/reduced costs, but the unconditional `NaN` fill immediately discards them. That makes `settings.dual_postsolve` ineffective in the presolve path.

🔧 Suggested fix

```diff
- thrust::fill(rmm::exec_policy(op_problem.get_handle_ptr()->get_stream()),
-              dual_solution.data(),
-              dual_solution.data() + dual_solution.size(),
-              std::numeric_limits<f_t>::signaling_NaN());
- thrust::fill(rmm::exec_policy(op_problem.get_handle_ptr()->get_stream()),
-              reduced_costs.data(),
-              reduced_costs.data() + reduced_costs.size(),
-              std::numeric_limits<f_t>::signaling_NaN());
+ if (!settings.dual_postsolve) {
+   thrust::fill(rmm::exec_policy(op_problem.get_handle_ptr()->get_stream()),
+                dual_solution.data(),
+                dual_solution.data() + dual_solution.size(),
+                std::numeric_limits<f_t>::signaling_NaN());
+   thrust::fill(rmm::exec_policy(op_problem.get_handle_ptr()->get_stream()),
+                reduced_costs.data(),
+                reduced_costs.data() + reduced_costs.size(),
+                std::numeric_limits<f_t>::signaling_NaN());
+ }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/linear_programming/solve.cu` around lines 888 - 902, The code unconditionally overwrites the reconstructed duals/reduced costs from presolver->undo() by filling dual_solution and reduced_costs with NaN, breaking settings.dual_postsolve; update the logic so the thrust::fill calls for dual_solution and reduced_costs only run when dual postsolve is disabled (e.g., check settings.dual_postsolve or equivalent flag) or when presolver->undo() did not produce valid values; locate the presolver->undo(...) call and the subsequent thrust::fill(...) calls for dual_solution and reduced_costs and guard/remove those fills when dual_postsolve is enabled so the reconstructed values are preserved.
402-424: ⚠️ Potential issue | 🟠 Major — Use the remaining time budget for barrier.

`timer` is already passed in, but Line 403 resets barrier to the full `settings.time_limit`. After presolve/setup work, barrier can run past the overall limit while the dual-simplex path correctly uses `timer.remaining_time()`.

🔧 Suggested fix

```diff
- barrier_settings.time_limit = settings.time_limit;
+ barrier_settings.time_limit = timer.remaining_time();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/linear_programming/solve.cu` around lines 402 - 424, The barrier run is being given the full settings.time_limit rather than the remaining budget; change the time assignment for barrier_settings to use the provided timer's remaining time (e.g., set barrier_settings.time_limit = timer.remaining_time()) before calling dual_simplex::solve_linear_program_with_barrier<i_t,f_t>, ensuring you clamp to a non-negative value (and optionally cap to settings.time_limit) so the barrier phase cannot exceed the remaining overall time.
681-693: ⚠️ Potential issue | 🔴 Critical — Copy `barrier_problem` before launching `dual_simplex_thread`.

The worker gets `dual_simplex_problem` by reference on Line 687, and Line 693 copies the same object afterward. If the thread starts immediately, `run_dual_simplex()` can mutate it while the copy is in progress.

🔧 Suggested fix

```diff
   dual_simplex::user_problem_t<i_t, f_t> dual_simplex_problem =
     cuopt_problem_to_simplex_problem<i_t, f_t>(d_barrier_problem);
+  dual_simplex::user_problem_t<i_t, f_t> barrier_problem = dual_simplex_problem;
   // Create a thread for dual simplex
   std::unique_ptr<
     std::tuple<dual_simplex::lp_solution_t<i_t, f_t>, dual_simplex::lp_status_t, f_t, f_t, f_t>>
     sol_dual_simplex_ptr;
   std::thread dual_simplex_thread(run_dual_simplex_thread<i_t, f_t>,
                                   std::ref(dual_simplex_problem),
                                   std::ref(settings_pdlp),
                                   std::ref(sol_dual_simplex_ptr),
                                   std::ref(timer));
-
-  dual_simplex::user_problem_t<i_t, f_t> barrier_problem = dual_simplex_problem;
```

As per coding guidelines: Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/linear_programming/solve.cu` around lines 681 - 693, The code launches dual_simplex_thread passing dual_simplex_problem by reference and only afterwards copies it into barrier_problem, creating a race if the thread mutates dual_simplex_problem; fix by making the copy before starting the thread (i.e., construct barrier_problem = dual_simplex_problem prior to creating dual_simplex_thread) or alternatively change the thread invocation (run_dual_simplex_thread) to take a const/value copy of dual_simplex_problem so the subsequent barrier_problem assignment cannot race with thread mutations; update references to dual_simplex_problem, barrier_problem, run_dual_simplex_thread, and dual_simplex_thread accordingly.
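The ordering fix described above can be sketched with a toy problem type (`problem_t`, `run_worker`, and `solve_concurrently` are stand-ins for illustration, not the actual cuOpt types):

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for user_problem_t; only the ordering matters:
// take the independent copy *before* any thread can mutate the original.
struct problem_t {
  std::vector<double> rhs;
};

// Simulates a worker thread mutating the problem it was handed by reference.
void run_worker(problem_t& p) { p.rhs.push_back(0.0); }

problem_t solve_concurrently(problem_t& dual_simplex_problem)
{
  // Copy first, so the worker's mutations cannot race with the copy.
  problem_t barrier_problem = dual_simplex_problem;
  std::thread worker(run_worker, std::ref(dual_simplex_problem));
  worker.join();
  return barrier_problem;  // snapshot taken before the thread started
}
```

Copying before the `std::thread` constructor runs removes the window in which the worker and the copy touch the same object concurrently.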
317-318: ⚠️ Potential issue | 🔴 Critical — Make the concurrent halt flag per-solve and atomic.

`global_concurrent_halt` is process-wide shared state. In C++, `volatile` does not provide inter-thread synchronization, so overlapping `solve_lp()` calls can interfere with each other and the worker threads in one concurrent solve race on this flag; the downstream writes on Lines 432, 503, and 631 inherit the same problem.

```bash
#!/bin/bash
# Inspect the shared halt-flag definition and all handoff/read/write sites.
rg -nC2 '\bglobal_concurrent_halt\b|\bconcurrent_halt\b' --glob '**/*.{cu,cuh,cpp,hpp,h}'
```

As per coding guidelines: Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/linear_programming/solve.cu` around lines 317 - 318, The global_concurrent_halt variable is process-wide and uses volatile (no synchronization); change it to a per-solve atomic flag and propagate it into the threads that read/write it (e.g., replace global_concurrent_halt with a std::atomic<bool> or std::atomic<int> member/local called concurrent_halt in the solve context passed into solve_lp() and worker threads). Update all read/write sites that touch global_concurrent_halt (the places in solve_lp() and the worker handoff/read/write locations) to use atomic load()/store() or operator= with appropriate memory ordering, add `#include` <atomic>, and remove the volatile global to eliminate inter-solve races. Ensure the flag’s lifetime is tied to a single solve invocation so concurrent solve_lp() calls do not share state.
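A minimal sketch of the per-solve atomic flag the prompt asks for (`solve_context_t` and `pdlp_worker` are illustrative names, not existing cuOpt symbols):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical per-solve context: each solve owns its own halt flag,
// so overlapping solves cannot signal each other by accident.
struct solve_context_t {
  std::atomic<int> halt{0};
};

// A worker polls the flag that belongs to *its* solve, not a global.
void pdlp_worker(solve_context_t& ctx, long& iterations)
{
  while (ctx.halt.load(std::memory_order_acquire) == 0) { ++iterations; }
}
```

Unlike `volatile`, the acquire load / release store pair gives a well-defined happens-before edge between the thread that sets the flag and the workers that observe it.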
535-538: ⚠️ Potential issue | 🟠 Major — Handle zero-row LPs with the trivial/dual-simplex path instead of `NumericalError`.

A bound-only LP is still a valid model. In the plain PDLP path this branch misclassifies OPTIMAL/UNBOUNDED cases; in concurrent mode the dedicated dual-simplex worker already covers the fallback, so the fix likely needs to branch on the solve mode instead of hard-failing here.
As per coding guidelines: Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/linear_programming/solve.cu` around lines 535 - 538, When problem.n_constraints == 0 do not unconditionally return pdlp_termination_status_t::NumericalError; instead branch based on the PDLP solve mode and route bound-only problems to the dual-simplex path: detect the solve mode (concurrent vs plain), and for plain/standalone runs invoke or forward to the dual-simplex solver (e.g., call the existing dualSimplexSolve or the routine the concurrent worker uses) and return its optimization_problem_solution_t<i_t,f_t> result; for concurrent mode do not hard-fail here and allow the dedicated dual-simplex worker/fallback to handle it (i.e., remove the NumericalError return and replace with a dispatch/forward to the dual-simplex handling code).

cpp/src/mip/solve.cu (4)
81-95: ⚠️ Potential issue | 🔴 Critical — Treat improving infinite bounds as unbounded in the empty-model fast path.

A positive-cost variable with `-inf` lower bound or a negative-cost variable with `+inf` upper bound is unbounded, but this branch writes that infinity into `solution.assignment` and keeps going. Zero-cost free variables have the same problem via `0 * inf` in `compute_objective()`, so this should short-circuit to the solver's unbounded status and only choose finite values for zero-cost variables.

As per coding guidelines: Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip/solve.cu` around lines 81 - 95, The current fast-path assigns infinite bounds into solution.assignment (using pb.variable_bounds with get_lower/get_upper) and then calls solution.compute_objective(), which produces incorrect results for unbounded cases; update the block that sets solution.assignment (the lambda used in thrust::for_each) to check for unboundedness: if pb.objective_coefficients[index] > 0 and get_lower(bounds) is -inf (or < -INF_THRESHOLD) OR if pb.objective_coefficients[index] < 0 and get_upper(bounds) is +inf, immediately short-circuit and return the solver’s unbounded status instead of writing infinities into solution; for zero-cost variables (objective_coefficients[index] == 0) avoid assigning infinite bounds—choose a finite default (e.g., 0 or a finite bound) so compute_objective() stays valid; ensure stats.solution_bound and CUOPT_LOG_INFO remain correct for the unbounded-return path and use solution.get_solution(...) only for finite solutions.
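The finite-value selection described in the prompt can be sketched host-side (`pick_zero_cost_value` is a hypothetical helper for illustration, not existing cuOpt code):

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// For a zero-cost variable, pick a finite feasible value instead of an
// infinite bound: 0 if it lies in [lb, ub], otherwise the nearest finite
// bound; fall back to 0 when the relevant bound is itself infinite.
double pick_zero_cost_value(double lb, double ub)
{
  if (lb <= 0.0 && 0.0 <= ub) return 0.0;          // 0 is already feasible
  if (lb > 0.0) return std::isinf(lb) ? 0.0 : lb;  // domain above 0: clamp up
  return std::isinf(ub) ? 0.0 : ub;                // domain below 0: clamp down
}
```

The key property is that the returned value is always finite for any non-empty domain, so a later `0 * value` objective term can never produce NaN.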
123-129: ⚠️ Potential issue | 🟠 Major — Don't scale caller-owned warm starts in place.

`settings` is const, but this loop still mutates the pointed-to initial-solution buffers. Reusing the same settings object for another solve will double-scale those warm starts and leave the caller's data in scaled space.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip/solve.cu` around lines 123 - 129, The loop is mutating caller-owned warm-start buffers by calling scaling.scale_primal(*initial_solution) on pointers in settings.initial_solutions; instead make copies and scale those copies (or populate an internal scaled_initial_solutions vector) so the original buffers are untouched. Concretely, for each entry in settings.initial_solutions create a local copy (or allocate a new internal buffer), call scaling.scale_primal on the copy, and use the copied/scaled buffer for the solver; do not call scaling.scale_primal directly on *initial_solution. Adjust usage of the solver to consume the internal scaled buffers instead of the original pointers.
203-205: ⚠️ Potential issue | 🟠 Major — Mirror LP's remaining-time + 1s presolve budget here.

This branch still computes the Papilo limit from the full `time_limit` and allows sub-second budgets. The LP path already uses `remaining_time()` with a 1-second floor to avoid the early-timeout path.

🔧 Suggested fix

```diff
- const double presolve_time_limit = std::min(0.1 * time_limit, 60.0);
+ const double presolve_time_limit =
+   std::max(1.0, std::min(0.1 * timer.remaining_time(), 60.0));
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip/solve.cu` around lines 203 - 205, Replace the current presolve_time_limit calculation that uses the full time_limit with one that mirrors the LP path: call remaining_time(), floor it to at least 1.0 seconds (e.g., max(remaining_time(), 1.0)) and then set presolve_time_limit = min(0.1 * that_remaining, 60.0); update the expression where presolve_time_limit is computed in solve.cu (referencing presolve_time_limit and time_limit) so Papilo gets the same remaining-time + 1s floor behavior as the LP branch.
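The clamping rule above can be isolated into one expression (`presolve_budget` is an illustrative helper name, not cuOpt API):

```cpp
#include <algorithm>
#include <cassert>

// The budget rule from the review: 10% of the *remaining* time, capped at
// 60 seconds, and never below the 1-second floor the LP path enforces.
double presolve_budget(double remaining_time)
{
  return std::max(1.0, std::min(0.1 * remaining_time, 60.0));
}
```

With this shape, a 5-second overall limit yields a 1-second presolve budget instead of the 0.5 seconds the unclamped formula would allow.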
266-278: ⚠️ Potential issue | 🟠 Major — Don't force the postsolved MIP result to be feasible.

`full_sol.compute_feasibility()` is already called and a warning is emitted on failure, but Line 278 still passes `true`. The `FIXME` above is also real: `full_stats` still come from the reduced problem, so this reconstruction can return an infeasible postsolved assignment as feasible.

As per coding guidelines: Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip/solve.cu` around lines 266 - 278, The code currently forces the postsolved solution to be marked feasible by calling full_sol.post_process_completed = true and then calling full_sol.get_solution(true, full_stats) while full_stats are from the reduced problem; instead, remove the hard-coded post_process_completed hack and stop passing true to get_solution — pass the true/false feasibility determined by full_sol.get_feasible() (or the result of full_sol.compute_feasibility()), and recompute the stats for the full/original problem (not use sol.get_stats()) before calling full_sol.get_solution(feasible_flag, full_stats). Ensure you update or compute full_stats from full_sol or the original problem context so index/constraint mappings and timings reflect the full problem when reconstructing the solution.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/linear_programming/solve.cu`:
- Around line 848-850: The current code uses settings.user_problem_file for both
original and presolved dumps, making it ambiguous; add a dedicated setting
(e.g., settings.presolved_problem_file or settings.presolve_file) and use it
when writing the reduced model via reduced_problem.write_to_mps (leave
settings.user_problem_file for the original model dump). Update the write
branches where reduced_problem.write_to_mps is called (the presolve branch that
checks settings.user_problem_file) to check and use the new
settings.presolved_problem_file instead, and ensure the other location still
writes the original model to settings.user_problem_file; also add the new option
to the CLI/config parsing so callers can supply a separate presolve filename.
In `@cpp/src/mip/solve.cu`:
- Around line 142-150: The call to scaled_sol.get_solution currently uses
(is_feasible_before_scaling || is_feasible_after_unscaling) which can return a
misleading feasible result when the two disagree; change it to use the
post-unscale/original-problem feasibility only by passing
is_feasible_after_unscaling to scaled_sol.get_solution (keep the other args:
solver.get_solver_stats(), false) so the returned solution reflects the original
problem feasibility after unscaling and rebinding of scaled_sol.problem_ptr.
- Around line 75-76: The PDLP hyperparameter block only sets
update_primal_weight_on_initial_solution and
update_step_size_on_initial_solution, leaving default_l_inf_ruiz_iterations and
default_alpha_pock_chambolle_rescaling uninitialized and risking state leakage;
update the pdlp_hyper_params object so all four fields
(update_primal_weight_on_initial_solution, update_step_size_on_initial_solution,
default_l_inf_ruiz_iterations, default_alpha_pock_chambolle_rescaling) are
explicitly set locally in the MIP scaling path (same pattern used in
linear_programming/solve.cu), ensuring the values are initialized per-solve and
not persisted after MIP returns.
---
Duplicate comments:
In `@cpp/src/linear_programming/solve.cu`:
- Around line 888-902: The code unconditionally overwrites the duals/reduced costs reconstructed by presolver->undo() by filling dual_solution and reduced_costs with NaN, breaking settings.dual_postsolve. Guard the thrust::fill calls so they run only when dual postsolve is disabled (e.g., check settings.dual_postsolve or the equivalent flag) or when presolver->undo() did not produce valid values, preserving the reconstructed values whenever dual postsolve is enabled.
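The guard can be sketched in host code as below, using `std::fill` in place of `thrust::fill` on device vectors; the flag names (`dual_postsolve_enabled`, `undo_succeeded`) mirror the comment and are not the real API:

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Only blank the arrays when postsolve is off or undo() produced nothing;
// otherwise keep the values reconstructed by presolver->undo().
inline void finalize_duals(std::vector<double>& dual_solution,
                           std::vector<double>& reduced_costs,
                           bool dual_postsolve_enabled, bool undo_succeeded) {
  if (!dual_postsolve_enabled || !undo_succeeded) {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    std::fill(dual_solution.begin(), dual_solution.end(), nan);
    std::fill(reduced_costs.begin(), reduced_costs.end(), nan);
  }
}
```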
- Around line 402-424: The barrier run is given the full settings.time_limit rather than the remaining budget. Set barrier_settings.time_limit from the timer's remaining time (e.g., barrier_settings.time_limit = timer.remaining_time()) before calling dual_simplex::solve_linear_program_with_barrier<i_t,f_t>, clamping to a non-negative value (and optionally capping at settings.time_limit) so the barrier phase cannot exceed the remaining overall time.
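The clamping can be reduced to one expression. A sketch, with the timer and settings values passed in as plain doubles rather than read from the real objects:

```cpp
#include <algorithm>

// Clamp the barrier phase to the remaining budget: never negative, and
// optionally never more than the overall settings.time_limit.
inline double barrier_time_limit(double timer_remaining, double overall_limit) {
  return std::max(0.0, std::min(timer_remaining, overall_limit));
}
```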
- Around line 681-693: dual_simplex_thread is launched with dual_simplex_problem passed by reference, and only afterwards is it copied into barrier_problem, so the copy races with any mutation by the thread. Either copy before starting the thread (construct barrier_problem = dual_simplex_problem before creating dual_simplex_thread), or change run_dual_simplex_thread to take a const reference or by-value copy so the subsequent barrier_problem assignment cannot race; update references to dual_simplex_problem, barrier_problem, run_dual_simplex_thread, and dual_simplex_thread accordingly.
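The copy-before-launch variant can be sketched as follows; `problem_t` is a stand-in for the dual simplex problem type, and the worker body stands in for run_dual_simplex_thread:

```cpp
#include <thread>
#include <vector>

struct problem_t { std::vector<double> rhs; };

inline problem_t solve_concurrently(problem_t dual_simplex_problem) {
  // Copy BEFORE the worker starts, while the data is still single-threaded.
  problem_t barrier_problem = dual_simplex_problem;
  std::thread dual_simplex_thread([&dual_simplex_problem] {
    // The worker may freely mutate its problem; barrier_problem is unaffected.
    for (auto& v : dual_simplex_problem.rhs) v *= 2.0;
  });
  dual_simplex_thread.join();
  return barrier_problem;
}
```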
- Around line 317-318: global_concurrent_halt is process-wide and relies on volatile, which provides no synchronization. Replace it with a per-solve std::atomic<bool> (e.g., a concurrent_halt member or local in the solve context passed into solve_lp() and the worker threads), update all read/write sites that touch global_concurrent_halt to use atomic load()/store() (or operator=) with appropriate memory ordering, add `#include <atomic>`, and remove the volatile global. Tie the flag's lifetime to a single solve invocation so concurrent solve_lp() calls do not share state.
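A minimal sketch of the per-solve atomic flag, assuming nothing about the real solve context beyond what the comment states:

```cpp
#include <atomic>
#include <thread>

// Worker loop that polls a per-solve halt flag instead of a volatile global.
inline int run_until_halted(std::atomic<bool>& halt) {
  int iterations = 0;
  while (!halt.load(std::memory_order_acquire)) { ++iterations; }
  return iterations;
}

// Each solve owns its own flag, so concurrent solves cannot signal each other.
inline void demo_halt() {
  std::atomic<bool> concurrent_halt{false};  // lives for one solve only
  std::thread worker([&] { (void)run_until_halted(concurrent_halt); });
  concurrent_halt.store(true, std::memory_order_release);
  worker.join();
}
```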
- Around line 535-538: When problem.n_constraints == 0, do not unconditionally return pdlp_termination_status_t::NumericalError. Branch on the solve mode instead: for plain/standalone runs, forward bound-only problems to the dual-simplex solver (e.g., the existing dualSimplexSolve, or the routine the concurrent worker uses) and return its optimization_problem_solution_t<i_t,f_t> result; in concurrent mode, do not hard-fail here and let the dedicated dual-simplex worker/fallback handle the problem.
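The dispatch logic can be sketched as a pure routing function; the enum and value names here are illustrative, not the real cuOpt types:

```cpp
// Route constraint-free (bound-only) problems to the dual-simplex path
// instead of failing with NumericalError.
enum class solve_mode_t { plain, concurrent };
enum class route_t { pdlp, dual_simplex, defer_to_concurrent_worker };

inline route_t route_bound_only(int n_constraints, solve_mode_t mode) {
  if (n_constraints > 0) return route_t::pdlp;
  return mode == solve_mode_t::plain ? route_t::dual_simplex
                                     : route_t::defer_to_concurrent_worker;
}
```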
In `@cpp/src/mip/solve.cu`:
- Around line 81-95: The fast path writes infinite bounds into solution.assignment (via pb.variable_bounds with get_lower/get_upper) and then calls solution.compute_objective(), which is incorrect for unbounded cases. In the lambda used by thrust::for_each, detect unboundedness: if pb.objective_coefficients[index] > 0 and get_lower(bounds) is -inf (or below -INF_THRESHOLD), or pb.objective_coefficients[index] < 0 and get_upper(bounds) is +inf, short-circuit and return the solver's unbounded status instead of writing infinities into the solution. For zero-cost variables (objective_coefficients[index] == 0), avoid assigning infinite bounds and choose a finite default (e.g., 0 or a finite bound) so compute_objective() stays valid. Keep stats.solution_bound and CUOPT_LOG_INFO correct on the unbounded-return path, and call solution.get_solution(...) only for finite solutions.
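The per-variable checks can be sketched in host code (the real version runs inside the thrust::for_each lambda); the minimization convention and the zero-cost fallback order are assumptions:

```cpp
#include <cmath>
#include <limits>

// A variable with positive cost and no lower bound, or negative cost and no
// upper bound, makes the minimization problem unbounded.
inline bool is_unbounded_var(double cost, double lb, double ub) {
  const double inf = std::numeric_limits<double>::infinity();
  return (cost > 0 && lb == -inf) || (cost < 0 && ub == inf);
}

// Value to assign when the problem is NOT unbounded: push toward the cheap
// bound, and give zero-cost variables a finite default so the objective
// computation stays valid.
inline double fast_path_assignment(double cost, double lb, double ub) {
  if (cost > 0) return lb;
  if (cost < 0) return ub;
  if (std::isfinite(lb)) return lb;
  if (std::isfinite(ub)) return ub;
  return 0.0;  // free zero-cost variable: any finite value works
}
```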
- Around line 123-129: The loop mutates caller-owned warm-start buffers by calling scaling.scale_primal(*initial_solution) on the pointers in settings.initial_solutions. Instead, copy each entry into an internal scaled_initial_solutions vector (or a freshly allocated buffer), call scaling.scale_primal on the copy, and have the solver consume the internal scaled buffers so the original buffers are untouched; do not call scaling.scale_primal directly on *initial_solution.
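A copy-then-scale sketch; `scaling_t` and its `scale_primal` are simplified stand-ins for the real scaling object:

```cpp
#include <vector>

// Stand-in for the real scaling routine: here it just multiplies by a factor.
struct scaling_t {
  double factor = 2.0;
  void scale_primal(std::vector<double>& x) const {
    for (auto& v : x) v *= factor;
  }
};

// Scale copies of the warm starts, leaving the caller-owned buffers untouched.
inline std::vector<std::vector<double>> make_scaled_initial_solutions(
    const std::vector<const std::vector<double>*>& initial_solutions,
    const scaling_t& scaling) {
  std::vector<std::vector<double>> scaled;
  scaled.reserve(initial_solutions.size());
  for (const auto* initial_solution : initial_solutions) {
    scaled.push_back(*initial_solution);   // copy first
    scaling.scale_primal(scaled.back());   // then scale only the copy
  }
  return scaled;
}
```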
- Around line 203-205: The presolve_time_limit calculation uses the full time_limit; mirror the LP path instead: take remaining_time(), floor it at 1.0 second (max(remaining_time(), 1.0)), and set presolve_time_limit = min(0.1 * that value, 60.0), so Papilo gets the same remaining-time-plus-1s-floor behavior as the LP branch.
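The formula described above reduces to a one-liner:

```cpp
#include <algorithm>

// Floor the remaining budget at 1 second, take 10% of it, cap at 60 seconds.
inline double presolve_time_limit(double remaining_time) {
  return std::min(0.1 * std::max(remaining_time, 1.0), 60.0);
}
```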
- Around line 266-278: The code forces the postsolved solution to be marked feasible by setting full_sol.post_process_completed = true and then calling full_sol.get_solution(true, full_stats) while full_stats come from the reduced problem. Remove the hard-coded flag and stop passing true to get_solution; pass the feasibility actually determined for the full solution (full_sol.get_feasible() or the result of full_sol.compute_feasibility()), and recompute full_stats from full_sol or the original problem context rather than sol.get_stats(), so index/constraint mappings and timings reflect the full problem when reconstructing the solution.
In `@README.md`:
- Line 5: Remove the accidental 26.06 rollover from this release/26.04 PR by reverting every occurrence of "26.06" in the README back to the intended "26.04": fix the version badge string "[]" and the install pin strings that contain "26.06", or undo those README edits entirely so the PR contains only the feature changes targeting release/26.04. Ensure no other "26.06" tokens remain before merging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: de27d017-bbe8-4032-af02-26907275f819
📒 Files selected for processing (45)
- .claude-plugin/marketplace.json
- .cursor-plugin/plugin.json
- .github/workflows/build.yaml
- .github/workflows/build_test_publish_images.yaml
- .github/workflows/pr.yaml
- .github/workflows/test.yaml
- .github/workflows/trigger-breaking-change-alert.yaml
- RAPIDS_BRANCH
- README.md
- VERSION
- conda/environments/all_cuda-129_arch-aarch64.yaml
- conda/environments/all_cuda-129_arch-x86_64.yaml
- conda/environments/all_cuda-131_arch-aarch64.yaml
- conda/environments/all_cuda-131_arch-x86_64.yaml
- cpp/src/linear_programming/solve.cu
- cpp/src/mip/solve.cu
- dependencies.yaml
- docs/cuopt/source/cuopt-python/routing/routing-example.ipynb
- docs/cuopt/source/faq.rst
- gemini-extension.json
- helmchart/cuopt-server/Chart.yaml
- helmchart/cuopt-server/values.yaml
- python/cuopt/pyproject.toml
- python/cuopt_self_hosted/pyproject.toml
- python/cuopt_server/pyproject.toml
- python/libcuopt/pyproject.toml
- skills/cuopt-developer/SKILL.md
- skills/cuopt-installation-api-c/SKILL.md
- skills/cuopt-installation-api-python/SKILL.md
- skills/cuopt-installation-common/SKILL.md
- skills/cuopt-installation-developer/SKILL.md
- skills/cuopt-lp-milp-api-c/SKILL.md
- skills/cuopt-lp-milp-api-cli/SKILL.md
- skills/cuopt-lp-milp-api-python/SKILL.md
- skills/cuopt-qp-api-c/SKILL.md
- skills/cuopt-qp-api-cli/SKILL.md
- skills/cuopt-qp-api-python/SKILL.md
- skills/cuopt-routing-api-python/SKILL.md
- skills/cuopt-server-api-python/SKILL.md
- skills/cuopt-server-common/SKILL.md
- skills/cuopt-user-rules/SKILL.md
- skills/lp-milp-formulation/SKILL.md
- skills/qp-formulation/SKILL.md
- skills/routing-formulation/SKILL.md
- skills/skill-evolution/SKILL.md
✅ Files skipped from review due to trivial changes (35)
- VERSION
- helmchart/cuopt-server/Chart.yaml
- skills/cuopt-user-rules/SKILL.md
- skills/cuopt-qp-api-c/SKILL.md
- skills/cuopt-installation-api-python/SKILL.md
- .cursor-plugin/plugin.json
- docs/cuopt/source/faq.rst
- RAPIDS_BRANCH
- skills/lp-milp-formulation/SKILL.md
- skills/cuopt-lp-milp-api-cli/SKILL.md
- gemini-extension.json
- skills/cuopt-lp-milp-api-c/SKILL.md
- .github/workflows/build_test_publish_images.yaml
- skills/cuopt-routing-api-python/SKILL.md
- skills/qp-formulation/SKILL.md
- .github/workflows/trigger-breaking-change-alert.yaml
- skills/routing-formulation/SKILL.md
- skills/cuopt-qp-api-python/SKILL.md
- .claude-plugin/marketplace.json
- conda/environments/all_cuda-131_arch-aarch64.yaml
- skills/cuopt-qp-api-cli/SKILL.md
- skills/cuopt-server-common/SKILL.md
- conda/environments/all_cuda-129_arch-x86_64.yaml
- skills/cuopt-installation-api-c/SKILL.md
- skills/cuopt-developer/SKILL.md
- python/cuopt_server/pyproject.toml
- skills/cuopt-installation-common/SKILL.md
- python/cuopt_self_hosted/pyproject.toml
- helmchart/cuopt-server/values.yaml
- docs/cuopt/source/cuopt-python/routing/routing-example.ipynb
- conda/environments/all_cuda-131_arch-x86_64.yaml
- skills/cuopt-installation-developer/SKILL.md
- skills/skill-evolution/SKILL.md
- skills/cuopt-lp-milp-api-python/SKILL.md
- dependencies.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
- conda/environments/all_cuda-129_arch-aarch64.yaml
- .github/workflows/test.yaml
- skills/cuopt-server-api-python/SKILL.md
- python/cuopt/pyproject.toml
- python/libcuopt/pyproject.toml
|
/ok to test 5698cc1 |
chris-maes
left a comment
LGTM. Thanks for this useful feature @hlinsen
|
/merge |
f80d6cf
into
NVIDIA:release/26.04
This PR adds a new option to write out the presolved model when presolve is applied.