ord: add cross-process CPU throttle to prevent overcommit#10053
ord: add cross-process CPU throttle to prevent overcommit#10053oharboe wants to merge 9 commits intoThe-OpenROAD-Project:masterfrom
Conversation
When multiple OpenROAD instances run in parallel (e.g. Bazel tests), each creating its own thread pool, total threads can far exceed available cores. This adds an always-on, zero-config throttle using per-CPU file locks in /tmp/openroad_cpu_sem/. Each instance acquires N slots (matching its thread count) before proceeding, blocking if the machine is fully committed. Crash-safe: OS releases locks on process death. Linux-only; no-op on macOS. Use -disable_throttle for debugging (e.g. tracking down non-determinism). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a CPU throttling mechanism to coordinate thread usage across multiple OpenROAD instances using file-based locks on Linux. It includes a new CpuThreadThrottle class, documentation, and a command-line flag to disable the feature. The review feedback highlights several improvements for robustness and modern C++ standards, including the use of O_CLOEXEC to prevent file descriptor leakage, handling potential infinite loops during resource acquisition, replacing deprecated usleep with std::this_thread::sleep_for, and cleaning up unused member variables.
src/CpuThrottle.cpp
Outdated
| { | ||
| // Open and lock the coordinator to serialize allocation attempts. | ||
| // This prevents deadlock: only one process tries to grab CPU slots at a time. | ||
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDWR, 0666); |
There was a problem hiding this comment.
The open call should include the O_CLOEXEC flag. This ensures that the file descriptor is automatically closed if the process executes another program (e.g., calling external tools like ABC), preventing file descriptor leakage.
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDWR, 0666); | |
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDWR | O_CLOEXEC, 0666); |
| if (coord_fd < 0) { | ||
| logger_->warn(utl::ORD, | ||
| 44, | ||
| "Throttle: cannot open coordinator lock: {}", | ||
| std::strerror(errno)); | ||
| return false; | ||
| } |
There was a problem hiding this comment.
If open fails due to a permanent error (such as EACCES if the directory permissions are incorrect), tryAcquire will return false and the constructor will loop indefinitely, spamming the log with warnings every 100ms. Consider handling permanent failures by either terminating the program using logger_->error or disabling the throttle mechanism for the current session.
References
- Calls to logger_->error are expected to terminate the program, so additional error handling like returning an error code or throwing an exception is not necessary.
src/CpuThrottle.cpp
Outdated
| i < total_cpus && static_cast<int>(acquired.size()) < num_threads; | ||
| ++i) { | ||
| std::string path = std::string(kSemDir) + "/cpu." + std::to_string(i); | ||
| int fd = open(path.c_str(), O_CREAT | O_RDWR, 0666); |
There was a problem hiding this comment.
| #ifdef __linux__ | ||
| #include <sys/file.h> | ||
| #include <sys/stat.h> | ||
| #include <unistd.h> |
src/CpuThrottle.cpp
Outdated
| #endif | ||
|
|
||
| #include <string> | ||
| #include <thread> |
src/CpuThrottle.cpp
Outdated
| num_threads); | ||
| logged_waiting = true; | ||
| } | ||
| usleep(100000); // 100ms |
There was a problem hiding this comment.
usleep is deprecated in POSIX.1-2008. Use std::this_thread::sleep_for instead. Additionally, using the kRetryDelayMs constant avoids the use of magic numbers, following the repository's general rules.
| usleep(100000); // 100ms | |
| std::this_thread::sleep_for(std::chrono::milliseconds(kRetryDelayMs)); |
References
- Define tunable parameters as named constants instead of using hardcoded magic numbers.
src/CpuThrottle.h
Outdated
| int coordinator_fd_ = -1; | ||
| utl::Logger* logger_; |
There was a problem hiding this comment.
The coordinator_fd_ member variable is unused and should be removed. Additionally, defining kRetryDelayMs here avoids magic numbers in the implementation.
| int coordinator_fd_ = -1; | |
| utl::Logger* logger_; | |
| static constexpr int kRetryDelayMs = 100; | |
| utl::Logger* logger_; |
References
- Define tunable parameters as named constants instead of using hardcoded magic numbers.
src/CpuThrottle.cpp
Outdated
|
|
||
| if (num_threads > total_cpus) { | ||
| num_threads = total_cpus; | ||
| } |
There was a problem hiding this comment.
warning: use std::min instead of > [readability-use-std-min-max]
src/CpuThrottle.cpp:10:
- #include <cerrno>
+ #include <algorithm>
+ #include <cerrno>| } | |
| num_threads = std::min(num_threads, total_cpus); |
| } | ||
|
|
||
| CpuThreadThrottle::CpuThreadThrottle(CpuThreadThrottle&& other) noexcept | ||
| : held_fds_(std::move(other.held_fds_)), |
There was a problem hiding this comment.
warning: no header providing "std::move" is directly included [misc-include-cleaner]
src/CpuThrottle.cpp:4:
+ #include <utility>
src/CpuThrottle.cpp
Outdated
| { | ||
| // Open and lock the coordinator to serialize allocation attempts. | ||
| // This prevents deadlock: only one process tries to grab CPU slots at a time. | ||
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDWR, 0666); |
There was a problem hiding this comment.
warning: no header providing "O_CREAT" is directly included [misc-include-cleaner]
.
^
src/CpuThrottle.cpp
Outdated
| { | ||
| // Open and lock the coordinator to serialize allocation attempts. | ||
| // This prevents deadlock: only one process tries to grab CPU slots at a time. | ||
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDWR, 0666); |
There was a problem hiding this comment.
warning: no header providing "O_RDWR" is directly included [misc-include-cleaner]
.
^
src/CpuThrottle.cpp
Outdated
| { | ||
| // Open and lock the coordinator to serialize allocation attempts. | ||
| // This prevents deadlock: only one process tries to grab CPU slots at a time. | ||
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDWR, 0666); |
There was a problem hiding this comment.
warning: no header providing "open" is directly included [misc-include-cleaner]
src/CpuThrottle.cpp:4:
+ #include <fcntl.h>| return false; | ||
| } | ||
|
|
||
| std::vector<int> acquired; |
There was a problem hiding this comment.
warning: no header providing "std::vector" is directly included [misc-include-cleaner]
src/CpuThrottle.cpp:4:
+ #include <vector>
src/OpenRoad.cc
Outdated
| sta_->setThreadCount(threads_); | ||
|
|
||
| if (!disable_throttle_ && threads_ > 0) { | ||
| cpu_throttle_ = std::make_unique<CpuThreadThrottle>(threads_, logger_); |
There was a problem hiding this comment.
warning: no header providing "std::make_unique" is directly included [misc-include-cleaner]
src/OpenRoad.cc:10:
- #include <stdexcept>
+ #include <memory>
+ #include <stdexcept>|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a cross-process CPU throttle mechanism using file-based locks to prevent core oversubscription when running multiple OpenROAD instances. It includes a new CpuThreadThrottle class, RAII guards for thread management, and a Python-based monitoring tool. Feedback focuses on improving robustness in multi-user environments by handling umask and file permissions (using O_RDONLY for locks), preventing log spam during contention, and ensuring the throttle is initialized by default. Additionally, several calls to acquireGlobalCpuThreads should be updated to explicitly pass the local thread count to ensure accurate slot reservation.
src/CpuThrottle.cpp
Outdated
| mkdir(kSemDir, 0777); | ||
| // Ignore EEXIST — directory may already exist from another process. |
There was a problem hiding this comment.
The mkdir call is subject to the process's umask. In a multi-user environment, if the first user to run OpenROAD has a restrictive umask (e.g., 0022), the directory will be created with 0755 permissions, preventing other users from creating lock files inside it. To ensure the directory is truly shared, consider calling chmod(kSemDir, 0777) if mkdir succeeds.
src/CpuThrottle.cpp
Outdated
| bool CpuThreadThrottle::tryAcquireAdditional(int additional) | ||
| { | ||
| // Open and lock the coordinator to serialize allocation attempts. | ||
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDWR | O_CLOEXEC, 0666); |
There was a problem hiding this comment.
Using O_RDWR for the coordinator lock requires write permission on the file. If the file was created by another user with a restrictive umask (e.g., 0644), subsequent users will fail to open it in O_RDWR mode. Since flock only requires an open file descriptor and does not check the open mode for exclusive locks, using O_RDONLY is more robust for a machine-wide shared resource.
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDWR | O_CLOEXEC, 0666); | |
| int coord_fd = open(kCoordinatorLock, O_CREAT | O_RDONLY | O_CLOEXEC, 0666); |
src/CpuThrottle.cpp
Outdated
| logger_->warn(utl::ORD, | ||
| 80, | ||
| "Throttle: cannot open coordinator lock: {}", | ||
| std::strerror(errno)); | ||
| return false; |
There was a problem hiding this comment.
This warning is inside a loop that retries every 100ms for up to 30 seconds. If the error is persistent (e.g., EACCES due to permission issues), it will spam the log with hundreds of identical warnings. Consider logging the error only once or handling persistent failures by disabling the throttle for the current session.
References
- When analyzing code within a loop, consider the entire loop structure to avoid redundant operations or repetitive logging within the loop body.
src/CpuThrottle.cpp
Outdated
| i < total_cpus_ && static_cast<int>(acquired.size()) < additional; | ||
| ++i) { | ||
| std::string path = std::string(kSemDir) + "/cpu." + std::to_string(i); | ||
| int fd = open(path.c_str(), O_CREAT | O_RDWR | O_CLOEXEC, 0666); |
There was a problem hiding this comment.
Similar to the coordinator lock, using O_RDONLY for the per-CPU lock files is preferred to avoid permission issues when multiple users share the same machine-wide pool.
| int fd = open(path.c_str(), O_CREAT | O_RDWR | O_CLOEXEC, 0666); | |
| int fd = open(path.c_str(), O_CREAT | O_RDONLY | O_CLOEXEC, 0666); |
| if (!disable_throttle_ && !cpu_throttle_ && threads_ > 0) { | ||
| cpu_throttle_ = std::make_unique<CpuThreadThrottle>(logger_); | ||
| setGlobalCpuThrottle(cpu_throttle_.get()); | ||
| } |
There was a problem hiding this comment.
The throttle is currently only initialized if setThreadCount is called (e.g., via the -threads command line flag). If OpenROAD is run with default settings, the throttle remains uninitialized, and the process will not participate in machine-wide CPU coordination. To ensure the throttle is "always-on" as described, it should be initialized during OpenRoad startup even if no explicit thread count is provided.
src/drt/src/TritonRoute.cpp
Outdated
| void TritonRoute::updateDesign(const std::vector<std::string>& updatesStrs, | ||
| int num_threads) | ||
| { | ||
| auto cpu_guard = ord::acquireGlobalCpuThreads(); |
There was a problem hiding this comment.
The acquireGlobalCpuThreads() call should explicitly use the num_threads parameter to ensure the throttle reserves the correct number of slots for the subsequent OpenMP parallel section.
| auto cpu_guard = ord::acquireGlobalCpuThreads(); | |
| auto cpu_guard = ord::acquireGlobalCpuThreads(num_threads); |
src/drt/src/TritonRoute.cpp
Outdated
|
|
||
| void TritonRoute::updateDesign(const std::string& path, int num_threads) | ||
| { | ||
| auto cpu_guard = ord::acquireGlobalCpuThreads(); |
src/drt/src/TritonRoute.cpp
Outdated
| void TritonRoute::sendDesignUpdates(const std::string& router_cfg_path, | ||
| int num_threads) | ||
| { | ||
| auto cpu_guard = ord::acquireGlobalCpuThreads(); |
src/grt/src/RepairAntennas.cpp
Outdated
|
|
||
| arc_->makeNetWiresFromGuides(nets_to_repair); | ||
| arc_->initAntennaRules(); | ||
| auto cpu_guard = ord::acquireGlobalCpuThreads(); |
src/par/src/Multilevel.cpp
Outdated
| } | ||
|
|
||
| // Parallel refine all the solutions | ||
| auto cpu_guard = ord::acquireGlobalCpuThreads(); |
There was a problem hiding this comment.
| // Create the throttle on first call (holds 1 slot for main thread). | ||
| // It will be resized to threads_ dynamically when tools need parallelism. | ||
| if (!disable_throttle_ && !cpu_throttle_ && threads_ > 0) { | ||
| cpu_throttle_ = std::make_unique<CpuThreadThrottle>(logger_); |
There was a problem hiding this comment.
warning: no header providing "std::make_unique" is directly included [misc-include-cleaner]
src/OpenRoad.cc:10:
- #include <stdexcept>
+ #include <memory>
+ #include <stdexcept>| @@ -29,6 +29,7 @@ | |||
| #include "odb/dbWireGraph.h" | |||
| #include "odb/geom.h" | |||
| #include "omp.h" | |||
There was a problem hiding this comment.
warning: 'omp.h' file not found [clang-diagnostic-error]
#include "omp.h"
^| @@ -53,6 +53,7 @@ | |||
| #include "odb/dbTypes.h" | |||
| #include "odb/geom.h" | |||
| #include "omp.h" | |||
There was a problem hiding this comment.
warning: 'omp.h' file not found [clang-diagnostic-error]
#include "omp.h"
^| @@ -32,6 +32,7 @@ | |||
| #include "odb/dbTypes.h" | |||
| #include "odb/geom.h" | |||
| #include "omp.h" | |||
There was a problem hiding this comment.
warning: 'omp.h' file not found [clang-diagnostic-error]
#include "omp.h"
^| @@ -30,6 +30,7 @@ | |||
| #include "global.h" | |||
| #include "odb/db.h" | |||
| #include "omp.h" | |||
There was a problem hiding this comment.
warning: 'omp.h' file not found [clang-diagnostic-error]
#include "omp.h"
^| @@ -19,6 +19,7 @@ | |||
| #include "global.h" | |||
| #include "odb/dbTypes.h" | |||
| #include "omp.h" | |||
There was a problem hiding this comment.
warning: 'omp.h' file not found [clang-diagnostic-error]
#include "omp.h"
^| @@ -24,6 +24,7 @@ | |||
| #include "odb/dbTypes.h" | |||
| #include "odb/geom.h" | |||
| #include "omp.h" | |||
There was a problem hiding this comment.
warning: 'omp.h' file not found [clang-diagnostic-error]
#include "omp.h"
^|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a cross-process CPU throttling mechanism to prevent core overcommitment by coordinating thread usage across multiple OpenROAD instances using file locks in /tmp. The implementation includes the CpuThreadThrottle class, a Python monitoring script, and integration into parallelized tools such as TritonRoute, FlexDR, and AntennaChecker. Review feedback highlights the need for graceful failure during initialization to avoid terminal errors on permission issues and suggests optimizing the slot acquisition loop to prevent unnecessary delays if the throttle is disabled.
| logger_->error(utl::ORD, | ||
| 80, | ||
| "Throttle: cannot open coordinator lock: {}", | ||
| std::strerror(errno)); | ||
| } |
There was a problem hiding this comment.
Using logger_->error here makes the CPU throttle initialization fatal. If OpenROAD cannot access the coordination files in /tmp (e.g., due to permission conflicts between different users on the same machine or a full disk), the entire program will terminate. Since the CPU throttle is a best-effort optimization to prevent overcommit, it should fail gracefully. Consider logging a warning and disabling the throttle for the remainder of the execution instead of terminating the process. This is consistent with the repository rule that logger_->error calls are terminal and should not be used for non-fatal initialization failures.
if (coord_fd < 0) {
logger_->warn(utl::ORD,
80,
"Throttle: cannot open coordinator lock: {}. Disabling throttle.",
std::strerror(errno));
total_cpus_ = 0;
return false;
}References
- Calls to logger_->error are expected to terminate the program, so additional error handling like returning an error code or throwing an exception is not necessary.
|
|
||
| bool logged_waiting = false; | ||
| int retries = 0; | ||
| while (!tryAcquireAdditional(additional)) { |
There was a problem hiding this comment.
If tryAcquireAdditional fails and disables the throttle (e.g., by setting total_cpus_ to 0 due to a permanent error), this loop will continue to retry until it times out (30 seconds), causing a significant delay. The loop condition should check if the throttle is still enabled.
| while (!tryAcquireAdditional(additional)) { | |
| while (total_cpus_ > 0 && !tryAcquireAdditional(additional)) { |
|
This feels like something the build system should be handling instead of trying to bake it into openroad. |
This is a two stage rocket:
|
Refactor CpuThrottle from fixed-at-creation to dynamic resize model: tools acquire slots on-demand before parallel sections and release them afterward, improving slot utilization across concurrent instances. Add etc/cpu_throttle_monitor.py for external monitoring of slot utilization via /proc/locks (read-only, no interference with locks). Add contention tracking (wait duration, resize count) with summary logged at shutdown. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Guard the real throttle implementation behind ENABLE_THROTTLE (defined only in Bazel). CMake builds compile no-op stubs from the same .cpp file, fixing the incomplete-type error in dst test stubs without header churn. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
include/ord/CpuThrottle.h
Outdated
| { | ||
| public: | ||
| CpuThreadGuard(CpuThreadThrottle* throttle, int num_threads); | ||
| ~CpuThreadGuard(); |
There was a problem hiding this comment.
warning: class 'CpuThreadGuard' can be made trivially destructible by defaulting the destructor on its first declaration [performance-trivially-destructible]
src/CpuThrottle.cpp:344:
- )
- ault;
+ )
+ ault| ~CpuThreadGuard(); | |
| ~CpuThreadGuard() = default; |
Additional context
src/CpuThrottle.cpp:344: destructor definition is here
)
^| // Copyright (c) 2022-2026, The OpenROAD Authors | ||
|
|
||
| #include "ord/CpuThrottle.h" | ||
| #include "ord/OpenRoad.hh" |
There was a problem hiding this comment.
warning: included header CpuThrottle.h is not used directly [misc-include-cleaner]
| #include "ord/OpenRoad.hh" | |
| #include "ord/OpenRoad.hh" |
Standard library includes and utl/Logger.h were inside #ifdef ENABLE_THROTTLE, making them invisible to clang-tidy in CMake builds (which don't define ENABLE_THROTTLE). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
| #include "ord/CpuThrottle.h" | ||
|
|
||
| #include <algorithm> | ||
| #include <chrono> |
There was a problem hiding this comment.
warning: included header algorithm is not used directly [misc-include-cleaner]
| #include <chrono> | |
| #include <chrono> |
|
|
||
| #include <algorithm> | ||
| #include <chrono> | ||
| #include <string> |
There was a problem hiding this comment.
warning: included header chrono is not used directly [misc-include-cleaner]
| #include <string> | |
| #include <string> |
| #include <algorithm> | ||
| #include <chrono> | ||
| #include <string> | ||
| #include <thread> |
There was a problem hiding this comment.
warning: included header string is not used directly [misc-include-cleaner]
| #include <thread> | |
| #include <thread> |
| #include <chrono> | ||
| #include <string> | ||
| #include <thread> | ||
| #include <utility> |
There was a problem hiding this comment.
warning: included header thread is not used directly [misc-include-cleaner]
| #include <utility> | |
| #include <utility> |
| #include <string> | ||
| #include <thread> | ||
| #include <utility> | ||
| #include <vector> |
There was a problem hiding this comment.
warning: included header utility is not used directly [misc-include-cleaner]
| #include <vector> | |
| #include <vector> |
| #include <thread> | ||
| #include <utility> | ||
| #include <vector> | ||
|
|
There was a problem hiding this comment.
warning: included header vector is not used directly [misc-include-cleaner]
The _lib targets (ant, grt, drt, mpl, par, gpl) include ord/CpuThrottle.h but lack the project include/ directory in their include paths. Only the openroad executable and swig targets had it. Create an ord_lib INTERFACE library exposing the include path and link it from all affected _lib targets. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Move std includes inside #ifdef ENABLE_THROTTLE since they are only used by the real implementation, not the no-op stubs. Remove unused CpuThrottle.h include from dst test stubs. Suppress false-positive trivially-destructible warning on CpuThreadGuard (destructor has real logic in ENABLE_THROTTLE builds). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Pass the local threads parameter to acquireGlobalCpuThreads() in doInitialPlace, runMBFF, and doNesterovPlace instead of relying on the global target, which may differ from the actual thread count used in these functions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
|
/gemini review |
When multiple OpenROAD instances run in parallel (e.g. Bazel tests), each creating its own thread pool, total threads can far exceed available cores. This adds an always-on, zero-config throttle using per-CPU file locks in /tmp/openroad_cpu_sem/. Each instance acquires N slots (matching its thread count) before proceeding, blocking if the machine is fully committed. Crash-safe: OS releases locks on process death. Linux-only; no-op on macOS. Use -disable_throttle for debugging (e.g. tracking down non-determinism).
Type of Change