v4.1.0-rc3 — STruC++ compiler + Vendor Plugin Packages#129
Open
thiagoralves wants to merge 59 commits into
Enable the runtime to compile and load VPP (Vendor Plugin Package) plugins from uploaded PLC projects. Previously only plugin config files were deployed; now plugin source code is compiled on-target and loaded.
Changes:
- Extend compile.sh to compile VPP plugin source from core/generated/vpp_plugin/ into a shared library (.so). Uses SHA-256 checksum caching to skip recompilation when source is unchanged. Cleans up previously compiled VPP plugins when upload has none.
- Extend plcapp_management.py update_plugin_configurations() to detect compiled VPP plugin .so files in the build directory and register them in plugins.conf. Handles both first-time deployment (add new entry) and subsequent uploads (update path). Disabling is handled by the existing config-matching logic when no VPP config is present.
- Add has_plugin(), add_plugin(), update_plugin_path() helper methods to PluginsConfiguration in plugin_config_model.py.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
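The checksum caching mentioned above lives in compile.sh; the following is only a minimal Python sketch of the same idea, not the script's actual logic. The function names and the stamp-file location are hypothetical. It hashes every source file into one SHA-256 digest and compares it with the digest recorded after the last successful build.

```python
import hashlib
from pathlib import Path


def source_digest(src_dir: Path) -> str:
    """Fold every file under src_dir (relative path + contents) into one SHA-256 digest."""
    h = hashlib.sha256()
    for path in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        h.update(path.relative_to(src_dir).as_posix().encode())
        h.update(b"\0")  # separator so "ab" + "c" and "a" + "bc" hash differently
        h.update(path.read_bytes())
    return h.hexdigest()


def needs_rebuild(src_dir: Path, stamp: Path) -> bool:
    """True when no digest was recorded yet or the sources changed since then."""
    return not stamp.exists() or stamp.read_text().strip() != source_digest(src_dir)


def record_build(src_dir: Path, stamp: Path) -> None:
    """Store the digest after a successful build (stamp must live outside src_dir)."""
    stamp.write_text(source_digest(src_dir) + "\n")
```

The stamp file is kept outside the hashed directory so that writing it does not itself invalidate the cache.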
The VPP plugin needs to link plugin_logger.c from the runtime. Pass RUNTIME_ROOT so the Makefile can locate it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move cJSON library from per-plugin copies (S7Comm, EtherCAT) to a
shared location at core/src/drivers/plugins/native/cjson/. This
eliminates duplication and provides a single canonical copy that all
native plugins link against — same pattern as plugin_logger.
Changes:
- Add core/src/drivers/plugins/native/cjson/cJSON.{c,h}
- Remove s7comm/cjson/ and ethercat/cjson/ duplicate copies
- Update both CMakeLists.txt to reference shared OPENPLC_ROOT path
- Fix ethercat_plugin.c include from "cjson/cJSON.h" to "cJSON.h"
- Add cjson include path to compile.sh for VPP plugin builds
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a VPP plugin is registered in plugins.conf during a program upload, plugin_driver_update_config copies the config but doesn't load the .so symbols. The subsequent plugin_driver_init then skips the plugin because native_plugin is NULL, causing "does not have a start_loop function" warnings. Fix: after copying each plugin config in update_config, check if the plugin is native and has no loaded symbols yet, and call native_plugin_get_symbols for it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a VPP plugin is recompiled on a subsequent program upload, the new .so file is written to disk but the runtime kept using the old dlopen'd handle from the previous load. This meant code changes in the VPP plugin never took effect without a full runtime restart. Fix: in plugin_driver_update_config, always dlclose and reload native plugins instead of skipping them when native_plugin is non-NULL. This ensures freshly compiled .so files are loaded on every upload. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After a program upload, the run_compile sequence was:
1. compile.sh
2. stop_plc(): sends STOP, returns "STOP:OK" immediately
3. compile-clean.sh
4. start_plc(): sends START once, ignores the response
The problem: STOP kicks off a background transition thread that does the actual unload (pthread_join, plugin_driver_stop, plugin_manager_destroy, …). While that thread is running, is_transitioning is still set and the unix-socket handler rejects any non-STATUS command with "COMMAND:BUSY". With the synergy plugin installed, stop_loop can take a second or more, easily outlasting a fast compile. START then lands during the BUSY window, gets rejected, and Python silently discards the response, so the PLC stays stopped until the user hits Play manually.
Replace the single start_plc() call with a short retry loop in a new _restart_plc_after_build helper that polls for up to 5 seconds, re-sending START on BUSY / ERROR / empty responses and returning as soon as the runtime accepts it (START:OK or ALREADY_RUNNING). If the window is exhausted, log a warning instead of silently leaving the PLC stopped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
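A rough sketch of the retry loop described above, assuming a `send_start` callable that returns the runtime's reply as a string. The poll interval, the substring matching and the injectable `clock`/`sleep` parameters are illustrative choices and are not taken from the actual _restart_plc_after_build helper.

```python
import time
from typing import Callable

# Replies that mean the runtime accepted START (names taken from the commit text).
ACCEPTED = ("START:OK", "ALREADY_RUNNING")


def restart_plc_after_build(
    send_start: Callable[[], str],
    window_s: float = 5.0,
    poll_s: float = 0.1,  # assumed poll interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Re-send START until it is accepted or the window runs out."""
    deadline = clock() + window_s
    while True:
        response = (send_start() or "").strip()
        if any(token in response for token in ACCEPTED):
            return True
        # BUSY, ERROR and empty replies all fall through to a retry.
        if clock() >= deadline:
            return False  # caller should log a warning here
        sleep(poll_s)
```

Returning a boolean leaves the logging policy to the caller, which matches the stated goal of warning instead of silently leaving the PLC stopped.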
Previously the runtime auto-started the PLC after a successful build with a local retry loop to work around the COMMAND:BUSY window left by the still-running STOP transition thread. That placed the retry policy on the runtime side, hidden from the caller. Move that responsibility to the editor: the runtime now finishes the build, resets crash tracking, and leaves the PLC stopped. The editor already polls /api/compilation-status; once status reports SUCCESS it will send START and handle BUSY retries with proper error surfacing for compilation errors, bad programs, etc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an optional get_stats(char *out, size_t) entry point to the native plugin ABI. Runtime aggregates outputs from all plugins that provide it into a plugin_stats object spliced into the existing STATS response. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds plugin_request_plc_stop_func_t to plugin_runtime_args_t. Native plugins call g_runtime_args.request_plc_stop(reason) when they detect an unrecoverable fault; the runtime then transitions from RUNNING to STOPPED via the same worker-thread pattern the STOP socket command uses, going through the normal plugin shutdown path (stop_loop on all plugins, unload program, clear image tables). begin_transition is renamed to plc_begin_transition and exported via unix_socket.h so plugin_driver.c can share the transition-guarded path instead of duplicating it. Same is_transitioning flag still blocks overlapping commands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously only scan_count was zeroed on start — min/max/avg fields carried over from the previous run, so a new program inherited stale metrics (and INT64_MAX sentinels would never recover once overwritten). Adds scan_cycle_stats_reset() that resets the full plc_timing_stats_t struct plus the internal expected_start_us/last_start_us trackers, called on both transition to RUNNING and in unload_plc_program so the STOPPED window also shows zeros. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
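To make the stale-metrics problem concrete, here is a toy Python model of the reset. The field names are hypothetical and do not mirror plc_timing_stats_t; the point is only that a reset must restore the min sentinel along with the counter.

```python
from dataclasses import dataclass

INT64_MAX = 2**63 - 1


@dataclass
class ScanStats:
    scan_count: int = 0
    min_us: int = INT64_MAX  # sentinel: any real sample is smaller
    max_us: int = 0
    total_us: int = 0

    def record(self, cycle_us: int) -> None:
        """Fold one scan-cycle duration into the running statistics."""
        self.scan_count += 1
        self.min_us = min(self.min_us, cycle_us)
        self.max_us = max(self.max_us, cycle_us)
        self.total_us += cycle_us

    @property
    def avg_us(self) -> float:
        return self.total_us / self.scan_count if self.scan_count else 0.0

    def reset(self) -> None:
        """Restore every field to its default, not only scan_count."""
        self.__init__()
```

If only `scan_count` were zeroed, a new program would start with the previous run's `min_us` and `max_us`, which is the carry-over the commit describes.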
Resolved parallel additions to plugin_runtime_args_t: kept both request_plc_stop (VPP) and common_ticktime_ns (EtherCAT task selection) fields. EtherCAT plugin files auto-merged cleanly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
scan_cycle_stats_reset() is called from unload_plc_program after the PLC cycle thread has already exited — and the thread's exit path destroyed stats_mutex via scan_cycle_manager_cleanup(). Locking a destroyed mutex is undefined behavior; on glibc it typically hangs the calling thread, which looked exactly like the symptom reported: runtime unresponsive to socket commands, then the process dies. Fix: initialize the stats mutex once in plc_main startup and destroy it once at shutdown. The mutex now lives for the entire process and stays valid across any number of program load/unload cycles. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
connect() created self.sock before calling .connect(self.socket_path).
On connect failure the new socket was never closed and self.sock was
left pointing at the unconnected fd, so:
- is_connected() returned True for a dead socket
- the next connect() created yet another socket, orphaning the prior
fd — one leaked fd per failed reconnect
The supervisor's _monitor loop retries every 2s while the runtime is
down, which was enough to drain the process's fd budget and trigger
[Errno 24] Too many open files on the next subprocess.Popen.
Fix: close any prior self.sock up front, build the new socket in a
local variable, and only assign to self.sock on successful connect.
Close the failed fd on any exception path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
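The connect() pattern from the fix above can be sketched as follows. The class name and surrounding methods are hypothetical; only the ordering (close the old socket, build the new one in a local, publish on success, close on failure) reflects what the commit describes.

```python
import socket
from typing import Optional


class RuntimeSocketClient:
    """Hypothetical client showing a leak-free reconnect."""

    def __init__(self, socket_path: str) -> None:
        self.socket_path = socket_path
        self.sock: Optional[socket.socket] = None

    def is_connected(self) -> bool:
        return self.sock is not None

    def close(self) -> None:
        if self.sock is not None:
            try:
                self.sock.close()
            finally:
                self.sock = None

    def connect(self) -> None:
        self.close()  # drop any prior socket up front
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(self.socket_path)
        except Exception:
            sock.close()  # close the failed fd so it is not leaked
            raise
        self.sock = sock  # publish only after a successful connect
```

Because `self.sock` is assigned last, `is_connected()` can no longer report True for a socket that never connected.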
plugins.conf was in .gitignore AND tracked in the index, which is the classic git footgun: .gitignore only applies to untracked paths, so edits to the file still showed up in git status and blocked pulls with 'local changes would be overwritten'. This bit deployments that added plugins locally (e.g. the SLM-RP4 synergy plugin). plugins_default.conf remains the canonical tracked default; both the C runtime and webserver already auto-copy it to plugins.conf on first run when the latter is missing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
update_plugin_configurations was called once at upload time, before run_compile. That path only sees pre-built plugins — VPP plugins are compiled on the target during run_compile, so their .so files don't exist yet when the first pass runs. Result: plugins.conf never gets a synergy entry, the runtime ignores the plugin entirely, and I/O stays silent despite a successful build. Fix: call update_plugin_configurations a second time at the end of run_compile, after cleanup has finished and any lib*_plugin.so files are on disk. The VPP registration branch then finds them and writes the plugins.conf entry. Hold build_state.status in COMPILING across the second call so the editor doesn't poll SUCCESS and send START before plugins.conf is complete. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the MatIEC-shaped .so interface with the hierarchical surface
the editor and Arduino runtime already speak. The runtime sources that
walk strucpp's class hierarchy move to C++; everything else stays C.
The .so exports only:
- strucpp_get_config() (configuration entry)
- strucpp_set_locks() (mutex pointer plumbing)
- strucpp_get_located_vars() (per-project locatedVars[])
- strucpp_get_located_var_count()
- strucpp_debug_array_count / elem_count / size / set / read
Plus the data symbols generated.cpp already emits (common_ticktime__,
plc_program_md5).
Vendored strucpp runtime headers at v0.4.5 in core/strucpp_runtime/include/
— the runtime executable AND every user .so build include from this
single copy, so ABI is automatically consistent.
Static C-linkage shim at core/strucpp_runtime/runtime_v4_entry.cpp (~50
lines including the locatedVars accessors). The shim defines g_config
(Configuration_CONFIG0), exports the C-linkage entries, and activates
debug_dispatch.hpp's STRUCPP_V4_DEBUG_EXPORTS_DEFINE block. This file
is built into every user .so by scripts/compile.sh — the editor
upload bundle does not ship it.
Runtime side:
- core/src/plc_app/image_tables.{h,cpp} (renamed from .c): walks the
located-var descriptor table via the shim accessors, binds image-
table buffer pointers. Owns the recursive PI image-tables and globals
mutexes; hands their pointers to the .so via strucpp_set_locks.
- core/src/plc_app/plc_state_manager.{h,cpp} (renamed from .c): dlsyms
strucpp_get_config, walks ConfigurationInstance via virtual dispatch
(no per-walk shim functions needed). Single-thread cycle for now;
Phase 6 will replace with thread-per-task.
- core/src/plc_app/debug_handler.c: rewritten around hierarchical
(arr_idx, elem_idx) addressing. FC 0x41-0x45 dispatch with
request-snapshot fix lifted from Arduino. set_endianness symbol
removed; editor probes endianness via FC 0x45 echo.
- core/src/drivers/plugin_utils.{h,c}: deleted (flat-index variable API
is gone). plugin_types.h drops get_var_list/size/count fields from
plugin_runtime_args_t. OPC UA Python plugin breaks at link time —
Phase 9 will migrate it onto the hierarchical strucpp_debug_* API.
- scripts/compile.sh: rewritten for C++17 with the new bundle. Rejects
MatIEC artifacts in core/generated/ with exit code 2.
- core/src/CMakeLists.txt: project type C → CXX, C++17 standard, .cpp
sources, strucpp_runtime/include on the include path.
extern "C" guards added to scan_cycle_manager.h, plc_state_manager.h,
plcapp_manager.h, utils/utils.h, utils/log.h, plugin_driver.h so they're
includable from C++ TUs.
Validation on macOS:
- All modified C and C++ files pass clang -Wall -Wextra -Werror
-fsyntax-only (with stub Python.h for plugin_driver's transitive dep
and skipping utils.c which uses Linux-only sys/prctl.h).
- End-to-end: built a sample .so from a 2-task strucpp-generated
config, dlopened it, called strucpp_get_config + strucpp_set_locks +
strucpp_get_located_var_count + strucpp_debug_array_count via dlsym.
All 9 expected symbols present and callable; ALL CHECKS PASSED.
Full Linux build deferred (requires Python.h, sched_setscheduler,
SCHED_FIFO, mlockall — Linux-only).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One pthread per IEC TASK at the priority declared in the user program.
Each task runs an absolute-deadline clock_nanosleep loop and acquires
the image-tables mutex around its program body. Per-thread crash
recovery via __thread current_task_ctx + sigsetjmp.
Changes:
- plc_state_manager.h: PlcTaskCtx struct (idx, interval_ns, priority,
cpu_affinity_mask, is_fastest_task, task_handle, pthread, sigjmp_buf,
heartbeat, local_tick). Dual-language atomic typedefs so the same
header compiles in both C (watchdog.c) and C++ TUs (plc_state_manager.cpp).
- plc_state_manager.cpp:
* plc_task_thread(): SCHED_FIFO priority elevation with priority
clamped to 1..99, optional CPU affinity (only when mask != 0),
per-thread sigsetjmp recovery, absolute-deadline scheduling on
Linux (clock_nanosleep CLOCK_MONOTONIC, TIMER_ABSTIME), nanosleep
fallback for macOS dev builds.
* plc_crash_handler(): __thread current_task_ctx routes signals to
the right siglongjmp target — task threads jump to their own
crash_jmp, the bootstrap to bootstrap_crash_jmp, anything else
gets the default handler back.
* plc_cycle_thread() (bootstrap): after setup, walks
ConfigurationInstance, allocates plc_tasks[] with one PlcTaskCtx
per IEC task, picks the fastest task (smallest interval_ns;
tie-break by priority then declaration order), spawns the task
threads, and waits for state change. On stop: signals SIGUSR1 to
each task to wake clock_nanosleep, then joins.
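The fastest-task rule in the bootstrap (smallest interval_ns, then priority, then declaration order) can be modelled in a few lines of Python. This is a sketch, not the C++ in plc_state_manager.cpp. The `Task` type is hypothetical, and the direction of the priority tie-break is an assumption: the sketch lets the numerically higher priority win, following the SCHED_FIFO convention.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Task:
    name: str
    interval_ns: int
    priority: int


def pick_fastest_task(tasks: Sequence[Task]) -> int:
    """Return the index of the task that should carry the housekeeping work."""
    if not tasks:
        raise ValueError("no tasks declared")
    # ASSUMPTION: a numerically higher priority wins an interval tie.
    # The index as the last key element keeps declaration order as final tie-break.
    return min(
        range(len(tasks)),
        key=lambda i: (tasks[i].interval_ns, -tasks[i].priority, i),
    )
```

Sorting on a tuple key makes each tie-break level explicit, so exactly one index is always selected.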
Phase 7 layers housekeeping (journal apply + plugin cycle hooks +
updateTime + tick++) onto the fastest task's thread by checking
ctx->is_fastest_task — already plumbed in PlcTaskCtx and set by the
bootstrap.
CPU affinity defaults to 0 (no pinning, kernel decides). The Phase 8
CPU_AFFINITY codegen extension will populate cpu_affinity_mask from
the user program; until that lands, all tasks run with affinity 0
unless explicitly overridden.
Validation on macOS:
- All modified files pass clang -Wall -Wextra -Werror -fsyntax-only
with the SCHED_FIFO / pthread_setaffinity_np / pthread_setname_np
blocks gated behind __linux__. macOS fallback uses nanosleep with a
relative deadline so the file at least syntax-checks.
Full multi-task / SCHED_FIFO / per-task watchdog tests need Linux
hardware — deferred.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
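Why the task loop uses an absolute deadline can be illustrated with a small simulation. It is plain arithmetic in Python rather than a real clock_nanosleep call, and both function names are made up for the example. A relative sleep starts counting after the body has run, so every cycle drifts by the body's run time; an absolute deadline advances by exactly one interval per cycle.

```python
from typing import List


def wake_times_relative(start_ns: int, interval_ns: int, work_ns: int, cycles: int) -> List[int]:
    """Sleep interval_ns after each body: the period becomes interval + work."""
    wakes, now = [], start_ns
    for _ in range(cycles):
        wakes.append(now)
        now += work_ns      # task body runs
        now += interval_ns  # relative sleep begins only after the body
    return wakes


def wake_times_absolute(start_ns: int, interval_ns: int, work_ns: int, cycles: int) -> List[int]:
    """Sleep until a precomputed deadline: the period stays at interval_ns."""
    wakes, now, deadline = [], start_ns, start_ns
    for _ in range(cycles):
        now = max(now, deadline)  # sleeping until a past deadline returns at once
        wakes.append(now)
        now += work_ns            # task body runs
        deadline += interval_ns   # next deadline is independent of the body's run time
    return wakes
```

With a 10 ms interval and a 2 ms body, the relative variant wakes every 12 ms while the absolute variant holds the 10 ms cadence.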
Move the per-cycle I/O work — journal_apply_and_clear,
plugin_driver_cycle_start/end, updateTime, tick++ — onto the fastest
IEC task's thread. Other task threads run their bodies only. This is
the smallest possible drift from the MatIEC-era single-thread runtime:
single-task projects behave identically; multi-task projects piggyback
housekeeping on the highest-cadence task.
The fastest-task selection is already in Phase 6 (lowest interval_ns,
tie-break by priority then declaration order; ctx->is_fastest_task
gets set on exactly one PlcTaskCtx). Phase 7 just teaches plc_task_thread
to run plc_run_io_cycle_pre()/_post() around its body when that flag
is set.
Changes:
- core/src/plc_app/plc_io_cycle.{h,cpp}: new — pre/post helpers split
the housekeeping work so it can be invoked from one place. Pre runs
journal_apply_and_clear + plugin_driver_cycle_start; post runs
ext_updateTime + plugin_driver_cycle_end + plc_heartbeat update +
tick__++. Both run inside the image-tables critical section the
task body already holds.
- core/src/plc_app/plc_state_manager.cpp: plc_task_thread checks
ctx->is_fastest_task and wraps the body in scan_cycle_time_start +
plc_run_io_cycle_pre / plc_run_io_cycle_post + scan_cycle_time_end.
- core/src/CMakeLists.txt: add plc_io_cycle.cpp to the build.
No new threads, no new mutexes, no per-plugin config schema. The
runtime tracks a single global plc_heartbeat (fastest-task-driven)
plus per-task heartbeats (Phase 6). The watchdog can use either or
both.
Validation on macOS:
- All modified files pass clang -Wall -Wextra -fsyntax-only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per review feedback: the runtime should be agnostic to which strucpp
version a user uploads, and the runtime targets Linux only — drop
both the header copy and the macOS dev paths.
Strucpp headers no longer vendored
- Delete core/strucpp_runtime/include/ (21 .hpp files) and STRUCPP_VERSION.
- New core/include/strucpp_abi.hpp: minimal layout-compatible mirror of
ConfigurationInstance, ResourceInstance, TaskInstance, ProgramBase,
LocatedVar (and the Located{Area,Size} enums). The runtime walks the
loaded .so via virtual dispatch using these mirrors instead of
including strucpp's full header set. Compile-time static_asserts on
sizeof/alignof/offsetof guard against ABI drift.
- core/src/plc_app/image_tables.cpp + plc_state_manager.cpp: include
strucpp_abi.hpp instead of iec_located.hpp / iec_std_lib.hpp.
- scripts/compile.sh: strucpp runtime headers now expected at
core/generated/strucpp_runtime/include/ (shipped with the user
upload), not in the runtime tree.
- core/src/CMakeLists.txt: include path switches from
core/strucpp_runtime/include to core/include.
- core/strucpp_runtime/README.md: rewritten to reflect the new
contract — the only file the runtime contributes per-.so is the
small static shim (runtime_v4_entry.cpp).
Linux-only target — drop macOS dev paths
- plc_state_manager.cpp: remove __APPLE__ / __linux__ guards. SCHED_FIFO
priority elevation, pthread_setname_np, pthread_setaffinity_np,
cpu_set_t, clock_nanosleep — all unconditional now. Drops the
nanosleep-relative fallback that existed for macOS dev syntax checks.
Add explicit #include <sched.h>.
- image_tables.cpp: PTHREAD_PRIO_INHERIT on the recursive mutex is
unconditional (was guarded against __APPLE__/__CYGWIN__/__MSYS__).
Validation on macOS
- ABI mirror layout-checked: standalone harness compares sizeof and
offsetof against the values strucpp v0.4.5 commits to in its
vendored docs. All static_asserts pass.
- End-to-end: built a sample .so against strucpp's actual headers
(from the strucpp source tree, simulating the user upload location);
dlopen'd it from a harness that walks the configuration through the
runtime's ABI mirror. cfg->get_name(), get_resource_count(),
get_resources() and the task array all return the expected values
(CONFIG0 → RES0 → FASTTASK 10ms pri 50 / SLOWTASK 100ms pri 30).
- image_tables.cpp passes -Wall -Wextra -Werror -fsyntax-only on
macOS. plc_state_manager.cpp expectedly fails on macOS now that
Linux guards are gone (pthread_setname_np 2-arg form, cpu_set_t).
Real Linux CI build is the next validation step.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous revision tried to enforce LocatedVar / TaskInstance / ResourceInstance sizes and offsets via static_assert. That's both brittle (32-bit ARMv7 builds in the runtime's Docker matrix have different layouts than 64-bit x86-64 / AArch64) and redundant — the runtime and the .so it loads are always built with the same compiler on the same platform, so identical struct field declarations produce identical layouts by construction. ABI consistency between the strucpp version a user .so was compiled against and the mirror declarations here is maintained as part of the development cycle. When strucpp ships a breaking ABI change, this file gets updated alongside the version bump. The mirror struct/class declarations are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
core/include/ was a directory I introduced when I added the ABI mirror; the existing project convention is core/src/lib/ for runtime library headers (where iec_types.h already lives). Moving the file there matches the layout, removes the extra top-level dir, and lets the .cpp callers use the same `../lib/foo.h` relative-include pattern they already use for iec_types.h.
Updates:
- git mv core/include/strucpp_abi.hpp → core/src/lib/strucpp_abi.hpp
- image_tables.cpp + plc_state_manager.cpp: include via "../lib/strucpp_abi.hpp"
- CMakeLists.txt: drop core/include from include_directories
- Empty core/include/ directory removed
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end test on Linux uncovered four required-but-undefined symbols:
config_init__, updateTime, common_ticktime__, plc_program_md5. The
runtime dlsyms them as required and refused to start the program with
"failed to resolve all required .so symbols".
These symbols don't come from strucpp's generated.{cpp,hpp} — that was
Config0.c's job in the MatIEC pipeline. The Arduino integration
defines them in StrucppBaremetal/Baremetal.ino; the v4 shim is the
equivalent place for the openplc-runtime path.
Added to runtime_v4_entry.cpp:
- common_ticktime__: extern "C" unsigned long long, default 20 ms,
overwritten by config_init__ to the GCD of declared task intervals
so scan_cycle_manager sees a real base tick.
- plc_program_md5: extern "C" const char*. Default placeholder string;
the editor can override at .so build time by emitting a header that
#defines STRUCPP_PLC_PROGRAM_MD5. (TODO: wire that through compile.sh.)
- config_init__(): walks g_config.get_resources() and computes the GCD
into common_ticktime__. Static init has already constructed g_config
by the time the runtime calls this hook, so the walk is safe.
- updateTime(): increments strucpp::__CURRENT_TIME_NS by
common_ticktime__, matching CODESYS scan-cycle semantics. Called by
the runtime's plc_run_io_cycle_post each cycle of the fastest task.
Verified on macOS: all 13 expected symbols now exported by a sample
.so (was 9). dlopen harness drives config_init__ and confirms
common_ticktime__ flips from the default 20 ms to the GCD of a sample
project's task intervals (10 ms / 100 ms → 10 ms).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
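The base-tick computation described above reduces to a GCD over the declared task intervals. A minimal Python sketch follows; the function name is hypothetical, and keeping the 20 ms default when no task is declared is an assumption, since the commit only says the default is overwritten.

```python
from functools import reduce
from math import gcd
from typing import Iterable

DEFAULT_TICK_NS = 20_000_000  # 20 ms default mentioned in the commit


def base_tick_ns(task_intervals_ns: Iterable[int]) -> int:
    """Largest tick that divides every declared task interval."""
    intervals = [i for i in task_intervals_ns if i > 0]
    if not intervals:
        return DEFAULT_TICK_NS  # ASSUMPTION: keep the default when no task is declared
    return reduce(gcd, intervals)
```

For the sample project's 10 ms and 100 ms tasks this yields 10 ms, matching the value the dlopen harness observed.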
…cals
Mirrors the rename done editor-side for the Arduino sketch (commit
9da20b81). The historical MatIEC names had no meaning under STruC++
beyond convention; they survived in the runtime because the shim and
the runtime were the only two ends of the dlsym contract and renaming
required a coordinated change.
.so symbol contract changes (shim ↔ runtime via dlsym):
  config_init__ (function)      DROP. The shim used to compute the GCD of declared
                                task intervals. The runtime already walks the same
                                ConfigurationInstance, so the computation moves
                                runtime-side (compute_base_tick_from_config in
                                image_tables.cpp).
  common_ticktime__ (variable)  DROP. The runtime now owns base_tick_ns directly
                                (utils.c). No more dlsym indirection through a
                                .so-resident pointer.
  updateTime (function)         → strucpp_advance_time(uint64_t tick_ns)
                                Now takes the tick as a parameter (runtime owns
                                base_tick_ns).
  plc_program_md5 (variable)    → strucpp_program_md5
Runtime-local renames:
  tick__                        → scan_counter
  ext_common_ticktime__ (ptr)   dropped (replaced by direct base_tick_ns)
  ext_config_init__             dropped
  ext_updateTime                → ext_strucpp_advance_time
  ext_plc_program_md5           → ext_strucpp_program_md5
Plugin runtime args:
  common_ticktime_ns            → base_tick_ns
Comments swept of "MatIEC-era" / "Config0.c" attribution where the
language was just legacy nomenclature. The compile.sh MatIEC rejection
guard is intentionally kept — it actively rejects MatIEC-shaped uploads
and the wording is the rejection message itself.
Old .so files compiled before this change will fail symbols_init with
the new names — that's the intent. Recompile via the editor's STruC++
pipeline to get a .so with the new contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the all-zeros placeholder with the actual MD5 of the loaded
program. The editor now writes core/generated/strucpp_program_md5.h
alongside generated.cpp during compile, defining
STRUCPP_PLC_PROGRAM_MD5 with the real hash. The shim includes that
header via __has_include — falling back to the placeholder only when
no editor-built program has been uploaded yet (raw runtime smoke
builds).
Also fix g++ warning:
warning: 'strucpp_program_md5' initialized and declared 'extern'
g++ parses the single-decl form `extern "C" const char *foo = ...`
ambiguously between language-linkage specifier and `extern` storage
class with initializer. Switch to block-form `extern "C" { ... }`,
which expresses C linkage without the warning.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Arduino sketch already reads the project MD5 from defines.h via the PROGRAM_MD5 macro. Align the v4 shim on the same convention so the editor only has to emit one canonical MD5 surface across targets. Replaces the previous strucpp_program_md5.h / STRUCPP_PLC_PROGRAM_MD5 combo. The v4 path now expects defines.h next to generated.cpp, shipped by the editor as part of the v4 conf-generation step. Falls back to the all-zeros placeholder when no editor-built program is loaded (raw runtime smoke builds). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A program loaded without an MD5 is broken and should fail loudly. Replace the __has_include guard + all-zeros fallback with a plain #include "defines.h" — missing file fails compile, undefined PROGRAM_MD5 fails link. Both are clear errors that point at the editor's emit step rather than silently masking a stale upload with a zeros hash. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The runtime's symbols_init does:
  *(void **)&ext_strucpp_program_md5 = dlsym(..., "strucpp_program_md5");
and debug_handler.c indexes ext_strucpp_program_md5[i] directly. If strucpp_program_md5 is defined as `const char *foo = "..."`, dlsym returns the address of the pointer variable, and the indexed reads surface raw pointer bytes as garbage (e.g. "▒▒[▒"). FC 0x45 then returns junk that never matches debug-map.json's md5.
Defining the symbol as `const char strucpp_program_md5[] = PROGRAM_MD5` makes the symbol's address the start of the string itself. This is the same shape the MatIEC-era plc_program_md5 used and what the runtime's pointer-indexing code expects.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
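The pointer-versus-array difference can be reproduced without a .so by using ctypes as a stand-in. This is an illustration only: `md5_array` plays the role of a `char name[]` symbol, `md5_pointer` the role of a `char *name` symbol, the helper is hypothetical, and the byte string is a placeholder rather than a real hash.

```python
import ctypes
import sys

# Stand-in for `char strucpp_program_md5[] = "..."`: the symbol IS the bytes.
md5_array = ctypes.create_string_buffer(b"PLACEHOLDER-MD5-STRING")
# Stand-in for `char *strucpp_program_md5 = "..."`: the symbol holds an address.
md5_pointer = ctypes.c_char_p(ctypes.addressof(md5_array))


def read_symbol_bytes(symbol_address: int, n: int) -> bytes:
    """Mimic indexing symbol[i]: read n raw bytes starting at the symbol's address."""
    return ctypes.string_at(symbol_address, n)


# Array form: the symbol address is the first character of the string.
via_array = read_symbol_bytes(ctypes.addressof(md5_array), 11)
# Pointer form: the symbol address is the pointer variable, so we read pointer bytes.
via_pointer = read_symbol_bytes(ctypes.addressof(md5_pointer), ctypes.sizeof(md5_pointer))
```

Decoding `via_pointer` as a native-endian integer gives back the address of the string, which is the "garbage" the debugger was returning in place of the MD5 text.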
In C++, namespace-scope `const` gives INTERNAL linkage by default
(unlike C). With `extern "C" { const char strucpp_program_md5[] = ... }`
the symbol stays internal and dlsym returns NULL — runtime saw
ext_strucpp_program_md5 == NULL and FC 0x45 responded with
MB_DEBUG_ERROR_NOT_LOADED (0x83).
extern "C" is a *language* linkage specifier; it does not override
the C++ const-internal-linkage rule. Drop the const so the symbol
gets external linkage and is reachable via dlsym. The runtime treats
ext_strucpp_program_md5 as char* anyway.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
STruC++ doesn't know about Python POUs, just as iec2c didn't. The
editor's Python POU stub emits an {external} block that calls
`getpid()`, `create_shm_name()` and `python_block_loader()` by
name only, expecting their declarations to be supplied by the
toolchain. MatIEC's compile.sh handled this by passing
`-include iec_python.h` to the Config0.c invocation; mirror that
here for generated.cpp.
The header lives at core/src/plc_app/include/iec_python.h and brings
in <sys/types.h> + <unistd.h> too, so pid_t and getpid resolve in the
same step. Stubs without any Python POU pay zero — iec_python.h is
declarations only.
Without this, generated.cpp from any project with a Python POU fails
with three "was not declared in this scope" errors at the .so build.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
generated.cpp is mostly per-element IECVar assignments and generated_debug.cpp is mostly constexpr-evaluated address tables. -O2's aggressive inlining and vectorization buy very little here while doubling compile time on a Pi 4 — a 50-POU project was taking 12+ min, tripping the editor's status-polling timeout. -O1 typically halves the compile time on these files with no measurable impact on scan-cycle latency, since the hot paths (FB locks, image-tables binding, scan-cycle housekeeping) live in runtime_v4_entry.cpp / core/src/plc_app/ — those keep their optimization level via the top-level CMakeLists. -pipe avoids temp files between cc1plus and as for an additional 5–10% win. Arduino targets are unchanged (constrained-resource devices need -Os from the Arduino platform.txt for flash size). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Runtime build now drives g++ via Makefile.strucpp invoked through
make -j$(nproc). Three wins compose:
- Parallel: per-file compilation rules saturate every available
core. Pi 4 builds drop from ~2 min to ~30 s on first run because
STruC++ now emits one TU per POU (configuration.cpp + pou_<NAME>.cpp
each), so make has real parallel work to schedule.
- ccache: when installed, all g++ invocations route through it.
Incremental rebuilds where only one POU's body changed reuse
every other .o from the cache. Drops typical edit-rebuild cycles
from minutes to a few seconds. Falls back transparently to bare
g++ on systems without ccache — no install requirement, no
error, just no incremental cache.
- Wildcard discovery: the Makefile's rule set is parametric over
`wildcard $(GENERATED_DIR)/*.cpp` rather than a hardcoded list.
Whatever set STruC++ split into ends up compiled in one pass —
no Makefile churn when POUs are added/removed.
compile.sh is now a thin wrapper: validates the upload is STruC++-
shaped (rejects MatIEC's Config0.c / glueVars.c, requires generated.hpp
plus at least one .cpp), picks a parallel job count (nproc on Linux,
hw.ncpu on Darwin, 4 fallback), and execs make. The webserver's API
contract — POST upload → run compile.sh — is unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups on the parallel-make / ccache work:
1. install.sh now pulls in `ccache` on every supported distro (apt, yum, dnf, pacman, zypper, apk). The runtime's Makefile.strucpp picks it up automatically; without ccache it still builds, just without the per-file caching. Calling it out explicitly here so users don't have to discover the install step on their own. ccache validates by hashing the *preprocessed source + compile flags + compiler version*, not by file mtime. The editor re-uploads every .cpp on every build, but ccache compares CONTENT, so unchanged TUs still hit the cache and skip recompilation entirely.
2. compile.sh: drop the macOS / Darwin / generic-fallback nproc detection. The runtime targets Linux only, and `nproc` is always available on supported distros, so no fallback is needed. Three lines shorter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wire the editor's "Clean build and upload" option through to the runtime: when the upload request carries `?clean=1`, run_compile wipes core/build/ (the per-project Make object cache) and runs `ccache -C` before invoking compile.sh. Defaults preserve existing behaviour — older editors that don't pass the flag get the normal incremental build. Missing ccache is treated as non-fatal (the build/ wipe alone already invalidates everything the Makefile tracks; ccache is a transparent acceleration layer). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
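A hedged sketch of the clean-build path described above. The helper names are hypothetical and the real run_compile may be structured differently; `which` and `run` are injectable only so the example can be exercised without touching a real ccache. `ccache -C` is ccache's clear-cache flag.

```python
import shutil
import subprocess
from pathlib import Path
from urllib.parse import parse_qs


def clean_requested(query: str) -> bool:
    """True when the upload query string carries clean=1."""
    return parse_qs(query).get("clean", ["0"])[0] == "1"


def prepare_clean_build(build_dir: Path, clean: bool,
                        which=shutil.which, run=subprocess.run) -> bool:
    """Wipe the Make object cache and clear ccache; return True if a wipe happened."""
    if not clean:
        return False  # default: normal incremental build
    shutil.rmtree(build_dir, ignore_errors=True)
    if which("ccache") is not None:
        run(["ccache", "-C"], check=False)  # a ccache failure is not fatal
    return True
```

A missing ccache binary simply skips the second step, since wiping the build directory already invalidates everything the Makefile tracks.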
Wire the existing strucpp_debug_* program-side exports through to the
plugin ABI so plugins (OPC-UA primarily) can read/write/force PLC
variables by (arr, elem) tuples — the same surface the editor's
debugger uses, just via in-process function pointers instead of the
Modbus PDU.
Six function pointers added to plugin_runtime_args_t:
debug_array_count() / debug_elem_count(arr) — table sizes
debug_size(arr, elem) — bytes consumed
debug_read(arr, elem, dest) — read current value
debug_set(arr, elem, forcing, bytes, len) — force / unforce
debug_write(arr, elem, bytes, len) — soft write (new)
The new debug_write op uses IECVar::set() rather than IECVar::force(),
giving OPC-UA / BACnet writes proper "value can be overwritten next
scan cycle" semantics — distinct from forcing which pins the value
indefinitely. Existing forces remain authoritative: a debug_write
while forced is silently ignored, matching the editor's debugger
behaviour.
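The difference between a soft write and a force can be modelled in a few lines. This is a toy Python stand-in for the semantics stated above, not the real `IECVar<T>` class; the class name `ModelVar` and its attributes are hypothetical.

```python
class ModelVar:
    """Toy model of the set/force semantics described in the commit text.

    Hypothetical class: it mirrors the stated behaviour only and is not
    the STruC++ IECVar<T> implementation.
    """

    def __init__(self, value: int = 0) -> None:
        self.value = value
        self.forced = False

    def set(self, value: int) -> None:
        # Soft write (debug_write): silently ignored while a force is active.
        if not self.forced:
            self.value = value

    def force(self, value: int) -> None:
        # Force (debug_set with forcing=True): pins the value indefinitely.
        self.forced = True
        self.value = value

    def unforce(self) -> None:
        # Unforce (debug_set with forcing=False): soft writes apply again.
        self.forced = False


var = ModelVar(1)
var.set(5)      # plugin soft write lands
var.set(2)      # a later scan-cycle assignment may overwrite it
var.force(9)    # debugger force pins the value
var.set(3)      # ignored: the force stays authoritative
print(var.value, var.forced)
```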
Plumbing:
- plugin_types.h: 6 new typedefs + struct fields (added <stdbool.h>
for the `bool forcing` arg).
- plugin_driver.c: 6 NULL-safe thunks forwarding to ext_strucpp_debug_*.
- image_tables.{h,cpp}: ext_strucpp_debug_write pointer + symbols_init
resolution and NULL guard.
- plugin_runtime_args.py: replace removed get_var_list/size/count
ctypes entries with the 6 new debug_* mirrors. Also adds
base_tick_ns which was missing on the Python side (latent
field-misalignment bug for any plugin that touched fields after
journal_write_lint).
Requires the program .so to export strucpp_debug_write (added in
strucpp commit d2a3e27 on feat/debug-write). Older programs miss the
symbol → symbols_init fails fast; the editor must rebuild against the
updated runtime header.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Schema migration to match the new STruC++ debugger ABI:
- SimpleVariable / VariableField / ArrayVariable lose `index: int`,
gain `arr: int` + `elem: int`. Variables are addressed by the
same (uint8_t, uint16_t) tuple the runtime's strucpp_debug_*
C exports take.
- `initial_value` field removed everywhere. The plugin will read
the program's actual initial value via debug_read at server
startup — no more drift between the configured default and the
program's IECVar<T> default constructor.
- Duplicate-address validator updated: collects (arr, elem)
tuples (instead of flat ints), and arrays expand their full
length so a scalar overlapping an array element gets caught.
- opcua_types.py VariableNode / VariableMetadata follow suit:
`debug_var_index → arr+elem`. Adds a convenience `addr` property
on metadata returning the (arr, elem) tuple.
Template (opcua_config_template.json) updated to the new shape so
import_config_from_file + validate round-trips cleanly. Confirmed:
Variables: 9 — bool_var arr=0 elem=0 INT, int_var arr=0 elem=1 …
Structures: 1 — sensor_data fields at (0,10)..(0,12)
Arrays: 2 — int_array (0,20)..(0,24), real_array (0,25)..(0,28)
Validation OK
The plugin's address-space wiring + sync loops still reference the
old `index` attributes and will crash until the next phases (5b/5c)
cover them.
This commit is the schema flip; the rest of the OPC-UA plugin
follows.
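The duplicate-address check with array expansion can be sketched as follows. The function name and the `(arr, base_elem, length)` array representation are assumptions made for illustration; the plugin's real validator operates on its own config models.

```python
from collections import Counter


def find_duplicate_addresses(
    scalars: list[tuple[int, int]],
    arrays: list[tuple[int, int, int]],
) -> list[tuple[int, int]]:
    """Return every (arr, elem) address claimed more than once.

    Hypothetical helper. `scalars` holds (arr, elem) tuples; `arrays` holds
    (arr, base_elem, length) and is expanded to one address per element.
    """
    seen: Counter = Counter(scalars)
    for arr, base, length in arrays:
        for i in range(length):
            # Array element i lives at base + i within the same arr.
            seen[(arr, base + i)] += 1
    return sorted(addr for addr, count in seen.items() if count > 1)


# Arrays shaped like the template: int_array (0,20)..(0,24), real_array (0,25)..(0,28).
arrays = [(0, 20, 5), (0, 25, 4)]
print(find_duplicate_addresses([(0, 0), (0, 1)], arrays))           # no overlap
print(find_duplicate_addresses([(0, 0), (0, 1), (0, 22)], arrays))  # scalar inside int_array
```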
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The MatIEC-era flat-index address space and its direct-ctypes-pointer
fast path are gone; replaced by uniform (arr, elem) addressing on
top of args.debug_read / args.debug_write / args.debug_set /
args.debug_size.
opcua_memory.py — rewritten as a thin typed codec:
- debug_read_value(args, arr, elem, datatype) → decoded Python value
or None when read fails (program not loaded, address out of bounds,
string-stub).
- debug_write_value — soft write through args.debug_write, calls
IECVar::set() in the runtime which respects existing forces and
lets the next scan cycle overwrite. This is what OPC-UA Write
semantics expect: a regular write, NOT a force.
- debug_force_value / debug_unforce — exposed for any future feature
that wants debugger-style pinning, but the OPC-UA write path uses
debug_write_value.
- initialize_variable_cache(args, addrs, datatypes) → builds a
(arr, elem) → VariableMetadata cache from args.debug_size + the
configured datatype. No more discovery via removed get_var_*
flat-index APIs.
- The 200+ LOC of MatIEC ctypes pointer arithmetic
(IEC_TIMESPEC / IEC_STRING / read_memory_direct /
write_timespec_direct / _validate_memory_address) is gone — those
structures were tied to MatIEC's address-table contract and have
no analog in the new debugger ABI.
opcua_types.py — VariableNode and VariableMetadata carry (arr, elem)
instead of a flat debug_var_index.
address_space.py — variable_nodes now keyed by (arr, elem); array
elements live at base + i within the same arr. Initial values pulled
from a small _type_default helper instead of the removed
initial_value config field — the first sync cycle overwrites the seed
with the program's actual value via debug_read.
synchronization.py — full rewrite:
- SynchronizationManager takes `args` (the PluginRuntimeArgs ctypes
struct) directly, no longer the SafeBufferAccess wrapper. Reaches
debug_read / debug_write / debug_array_count via attribute access.
- Top loop checks args.debug_array_count() to detect program-loaded.
- sync_opcua_to_runtime: per (arr, elem), encode the OPC-UA value
and call debug_write_value. TIME-family values get recombined
from (tv_sec, tv_nsec) into int64 nanoseconds matching strucpp's
on-wire encoding.
- sync_runtime_to_opcua: per (arr, elem), debug_read_value →
convert_value_for_opcua → write_attribute_value (with
SourceTimestamp / ServerTimestamp for subscription support).
- Array elements iterated directly via base + i; the dropped
"direct memory access" fast path is replaced by uniform
debug_read calls (200ms cycle, ~hundreds of vars typical — no
perf regression).
server.py — passes buffer_accessor.runtime_args to SyncManager.
SafeBufferAccess gains a .runtime_args attribute so this access is
explicit (the wrapper itself is dismantled in Phase 5d cleanup).
tests/pytest/plugins/opcua/test_memory.py removed — exclusively
covered the deleted direct-memory ctypes APIs (IEC_STRING /
IEC_TIMESPEC / read_memory_direct etc.). Replacement coverage for
the new debug_* helpers belongs in a separate test file with mocked
ctypes function pointers — out of scope for this refactor.
Ships with Phase 5a's schema flip; the plugin is internally
consistent again. Phase 5d will collapse the now-vestigial
SafeBufferAccess shim once nothing else needs it.
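The TIME-family recombination mentioned for sync_opcua_to_runtime can be sketched like this. The function names are hypothetical, and the native byte order used for the 8-byte payload is an assumption; the commit text only states that the value is an int64 nanosecond count.

```python
import struct

NS_PER_SEC = 1_000_000_000


def time_to_ns(tv_sec: int, tv_nsec: int) -> int:
    """Recombine a (tv_sec, tv_nsec) pair into total nanoseconds."""
    return tv_sec * NS_PER_SEC + tv_nsec


def ns_to_time(total_ns: int) -> tuple[int, int]:
    """Split total nanoseconds back into (tv_sec, tv_nsec)."""
    return divmod(total_ns, NS_PER_SEC)


def encode_time(tv_sec: int, tv_nsec: int) -> bytes:
    """Pack the value as a signed 64-bit integer.

    Assumption: native byte order ("=q"); the actual on-wire layout is
    defined by strucpp and is not spelled out in the commit text.
    """
    return struct.pack("=q", time_to_ns(tv_sec, tv_nsec))


payload = encode_time(1, 500_000_000)
print(len(payload), ns_to_time(struct.unpack("=q", payload)[0]))
```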
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The shared Python plugin SDK exposed three layers of flat-index
variable I/O (get_var_list / get_var_size / get_var_count /
get_var_value / get_var_*_batch) bound to MatIEC's address-table
ABI. After Phase 5 the OPC-UA plugin no longer calls any of them
and no other plugin ever did, so the entire stack comes out:
- shared/debug_utils.py — deleted (DebugUtils class, 495 LOC).
- shared/component_interfaces.py — IDebugUtils abstract interface
deleted (49 LOC).
- shared/safe_buffer_access_refactored.py — wrapper methods that
forwarded to DebugUtils deleted (44 LOC). The DebugUtils
field/init line goes too. A short docstring block stays at the
site explaining where the new typed helpers live
(opcua/opcua_memory.py: debug_read_value, debug_write_value,
debug_force_value, debug_unforce, initialize_variable_cache).
- shared/__init__.py — IDebugUtils removed from re-exports +
__all__.
SafeBufferAccess keeps its image-tables / journal / mutex / config
surfaces — those are the buffers the EtherCAT, S7Comm, and Modbus
plugins still rely on, and they're orthogonal to the variable-by-
(arr, elem) surface OPC-UA needs.
Smoke-tested: PluginRuntimeArgs ctypes layout still 528 bytes,
OPC-UA template still loads + validates with 9 vars / 1 struct /
2 arrays. No remaining grep hits for DebugUtils / IDebugUtils /
debug_utils outside __pycache__.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs in run_compile() were causing the runtime to land in STOPPED
after a successful upload:
1. build_state.status was overwritten by the cleanup step's
wait_and_finish(). The check that gates start_plc() looked at the
final status, which was the cleanup result — so:
- compile SUCCESS + cleanup FAILED → no restart (PLC stays
STOPPED even though the build worked).
- compile FAILED + cleanup SUCCESS → start_plc called against
a stale .so.
wait_step() now just returns the bool exit-result without touching
shared state. We compose `compile_ok && cleanup_ok` once at the
end and update build_state.status from the combined result. The
start_plc gate uses the same combined boolean.
2. stop_plc() and start_plc() round-trip over a 500 ms Unix-socket
call. The runtime ACKs synchronously but the actual teardown /
bring-up (tasks, plugins, .so load/unload, Python loader) runs
asynchronously. compile-clean.sh would move build/new_libplc.so
into its timestamped name while the runtime was still tearing the
previous program down, then start_plc would fire while teardown
hadn't completed — explains the user-reported "ends in STOPPED"
on most uploads.
New _wait_for_plc_state() polls status_plc() until the runtime
reports the expected state (with a 30 s timeout for STOP, 15 s
for START). Inserted between stop_plc() and the cleanup script,
and again after start_plc() to confirm the new program reached
RUNNING.
Also surfaces start_plc's response in the build log so a failed
START is visible to the user instead of being swallowed silently.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nial logs
Two related quality-of-life fixes for OPC-UA writes from anonymous
clients:
1. Default-permissive anonymous when no user model is configured.
Previously _authenticate_anonymous() always assigned the viewer
role, which is read-only on the editor's default per-variable
permissions ({viewer:'r', operator:'r', engineer:'rw'}). Result:
"drop in OPC-UA, run the insecure profile, click connect from
UAExpert" gave you read-only access even though no user
restrictions were set up — write attempts came back as
BadUserAccessDenied.
New policy is driven by the user list:
- len(config.users) == 0 → anonymous gets the engineer role
(UserRole.Admin). The server is single-tenant-by-default;
there's no privilege model to enforce, so writes work
end-to-end without setup.
- users configured → anonymous stays as viewer. The admin
opted into a user model, so the unauthenticated path stays
read-only as before.
Per-variable permissions still apply on top of this — a variable
with `engineer: 'r'` is read-only even for the engineer role.
2. Suppress traceback noise for permission denials.
When a write IS denied, callbacks.py raises ua.UaError. That's
the documented asyncua API for rejecting a request — the wire
response is BadUserAccessDenied as the spec requires. asyncua's
process_message however catches every callback exception with
logger.exception(), so the runtime log fills with a 14-line
Python traceback for what is really a clean, expected denial.
Install a logging filter on asyncua.server.uaprocessor that
recognises our denial messages by the leading "Access denied:"
marker and drops the traceback. The one-line audit log from
callbacks.py ("DENY write for user … on node …") still fires.
Genuine processor errors (bad decode, broken requests) keep
their traceback.
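A minimal sketch of such a logging filter, using only the standard `logging` module. The class name is hypothetical, and the choice to keep the one-line record while stripping its traceback is an assumption about how "drops the traceback" is implemented.

```python
import logging


class DenialTracebackFilter(logging.Filter):
    """Strip the traceback from records caused by expected access denials.

    Hypothetical class: it matches on the exception text attached to the
    record, using the "Access denied:" marker named in the commit text.
    """

    MARKER = "Access denied:"

    def filter(self, record: logging.LogRecord) -> bool:
        exc = record.exc_info[1] if record.exc_info else None
        if exc is not None and str(exc).startswith(self.MARKER):
            # Expected denial: keep the log line, drop the traceback.
            record.exc_info = None
            record.exc_text = None
        # Never suppress the record itself; genuine errors pass through untouched.
        return True


# Attach to the logger named in the commit text. This is harmless when
# asyncua is not installed, because getLogger creates the logger on demand.
logging.getLogger("asyncua.server.uaprocessor").addFilter(DenialTracebackFilter())
```

Matching on the exception text rather than the log message keeps the filter independent of whatever wording the library uses for its own log line.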
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause of the user-reported "start_plc fails with COMMAND:BUSY" after a successful upload: plc_set_state() flips plc_state to the target value (e.g. STOPPED) near the top — that flip MUST happen first because the running task threads use `while (plc_get_state() == RUNNING)` to exit, and unload_plc_program() depends on those threads draining out before joining. Only after the state flips does plc_set_state() do the actual unload/load work, which can take seconds for projects with multiple plugins + Python blocks. While that work runs, the unix- socket dispatcher's `is_transitioning` flag stays at 1, so any non-PING/STATUS command gets COMMAND:BUSY. The webserver's _wait_for_plc_state(STOPPED) saw STATUS:STOPPED immediately (the flip is fast), assumed the runtime was ready, ran compile-clean.sh, and sent START — which hit the still-transitioning runtime and got COMMAND:BUSY. PLC ended in STOPPED. Two-line fix in unix_socket.c: if `is_transitioning` is set, return STATUS:TRANSITIONING instead of STATUS:STOPPED / STATUS:RUNNING. The flag is already maintained correctly (set in begin_transition, cleared at the end of transition_worker). External pollers now see the truth — STATUS:STOPPED only when the worker has finished and the next command is processable. The plc_state-flip ordering inside plc_set_state() stays as-is (state flip first, then the work) — moving the work first would deadlock since the task-stop signal would never fire. A comment in plc_set_state spells out why and points at the unix_socket gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… cleanup
The pre-cleanup wait introduced in 5c114a0 polled status_plc() for
"STOPPED". That was wrong — the actual intent is "wait until the
runtime is settled and ready to accept the next command", which isn't
the same thing as "wait for state == STOPPED":
- If the PLC was never started (state == INIT or EMPTY), STATUS
returns STATUS:INIT or STATUS:EMPTY. My poll never matched STOPPED so
it stalled for the full 30 s timeout, then proceeded anyway. The user
saw a 30-second stall after every clean upload.
- If the PLC was running and stop_plc() flipped it to STOPPED, the
wait worked but happened to use the new TRANSITIONING flag indirectly
(STATUS:TRANSITIONING masks STATUS:STOPPED until the worker
completes).
Replace _wait_for_plc_state(STOPPED) with a separate
_wait_for_plc_idle() that polls until STATUS does NOT contain
TRANSITIONING. Any settled state — STOPPED, INIT, EMPTY, even RUNNING
(e.g. if some command races with another) — is an acceptable signal
that the runtime can process the next command. The old
_wait_for_plc_state stays for the post-START path where "wait for
RUNNING specifically" IS the right check.
The previous commits assumed the user always had a program running
before upload. The fix-up here handles the cold-start path that the
user just hit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
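The idle wait described in the commit above can be sketched as a small polling loop. The function signature, the default timeout, and the poll interval are assumptions; the only behaviour taken from the text is "poll until STATUS does not contain TRANSITIONING".

```python
import time
from typing import Callable


def wait_for_plc_idle(status_plc: Callable[[], str],
                      timeout_s: float = 30.0,
                      poll_s: float = 0.1) -> bool:
    """Poll until the runtime reports any settled state.

    Hypothetical sketch: `status_plc` is any callable returning the raw
    STATUS response string. Returns True once the response no longer
    contains TRANSITIONING, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # STOPPED, INIT, EMPTY, and RUNNING all count as settled.
        if "TRANSITIONING" not in status_plc():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


responses = iter(["STATUS:TRANSITIONING", "STATUS:TRANSITIONING", "STATUS:INIT"])
print(wait_for_plc_idle(lambda: next(responses), timeout_s=1.0, poll_s=0.001))
```

Because the check runs before the deadline test, a runtime that is already settled returns immediately, which is the cold-start case that previously stalled.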
Editor's Stop button felt broken because the OPC-UA plugin held the
runtime's stop sequence for ~10 s waiting on a "graceful" shutdown
that almost always needed force-cancellation anyway. Three places
contributed to the latency, all tightened:
1. plugin.py:stop_loop — graceful join was 10 s, forced join was 5 s
on top of that. Tighten to 2 s graceful + 3 s forced. The forced path
is the one that actually works (per the user's logs: "OPC UA server
stopped" appeared right after "forcing event loop cancellation"), so
the long grace period was just dead time while asyncua's
server.stop() waited on connected clients.
2. server.py:_cleanup — asyncua's `await self.server.stop()` was
unbounded. A UAExpert client left running across an editor Stop press
kept it parked, ticking down the entire 10 s graceful budget. Wrapped
in asyncio.wait_for(..., 1.5 s) — clients that don't disconnect by
then get dropped when the listening sockets close, which is the right
behaviour for a forced-stop scenario.
3. synchronization.py:run — the sync loop only checked is_running()
at the TOP of each cycle. For a project with many variables, one sync
cycle can take meaningful time (debug_read + asyncua
write_attribute_value per variable). Add an extra is_running() check
between sync_opcua_to_runtime and sync_runtime_to_opcua, which bounds
shutdown latency to one half-cycle in the worst case instead of one
full cycle.
Net: editor Stop should feel near-instant on idle stacks (no clients
connected, no PLC running), and bounded by ~1.5 s when asyncua has to
evict clients. Forced-cancel still backs everything up in case of a
hang somewhere we didn't anticipate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
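The bounded stop in server.py can be illustrated with plain `asyncio`. The helper name and its boolean return value are hypothetical; the stand-in coroutines below replace asyncua's `server.stop()` so the sketch runs without the library.

```python
import asyncio
from typing import Awaitable, Callable


async def stop_with_deadline(stop: Callable[[], Awaitable[None]],
                             timeout_s: float = 1.5) -> bool:
    """Await a stop coroutine, giving up after `timeout_s` seconds.

    Hypothetical helper. Returns True on a clean stop and False when the
    deadline expired and the pending stop was cancelled by wait_for.
    """
    try:
        await asyncio.wait_for(stop(), timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        return False


async def quick_stop() -> None:
    # Stand-in for a server with no connected clients.
    await asyncio.sleep(0)


async def stuck_stop() -> None:
    # Stand-in for a stop call parked on a client that never disconnects.
    await asyncio.sleep(10)


print(asyncio.run(stop_with_deadline(quick_stop, 0.5)))
print(asyncio.run(stop_with_deadline(stuck_stop, 0.05)))
```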
load_plc_program() left `plc_program` alive on dlopen / pthread_create failure. Compile-clean rotates the libplc filename on every build (libplc_<ns_timestamp>.so), so the next "Start PLC" went straight back into load_plc_program with the now-stale so_path and failed with `cannot open shared object file: No such file or directory` — never re-running find_libplc_file to discover the new artifact. Destroy the manager and clear plc_program in both failure paths. plc_set_state's `if (plc_program == NULL)` guard then re-discovers the libplc on the next RUNNING transition. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 6 made each IEC task run on its own pthread, but
scan_cycle_manager kept a single global plc_timing_stats that only
the fastest task updated. The STATS endpoint reported one set of
numbers regardless of how many tasks the user declared, so the
slower tasks were invisible.
Refactor scan_cycle_manager into a `scan_cycle_tracker_t` struct
embedded on every PlcTaskCtx. Each task thread updates its own
tracker around its scan body — every task is timed independently,
not just the fastest one. The IO housekeeping window (plugin
cycle_start/cycle_end hooks) stays anchored on the fastest task
because plugins still assume one cross-cycle handoff per scan.
STATS response now carries a JSON array with one entry per task:
STATS:{"tasks":[
{"name":"plc-task-0","scan_count":N,"scan_time_min":...},
{"name":"plc-task-1",...}
]}
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the cycle_start/cycle_end hook architecture for EtherCAT.
Each master now owns its own pthread, ticking absolutely at
master.cycle_time_us under SCHED_FIFO at task_priority (default 90,
above typical IEC task priorities so the bus exchange isn't starved
by a long PLC scan).
Per cycle the bus thread does the I/O exchange under split mutex
windows so IEC tasks aren't blocked across the network round-trip:
lock buffer_mutex
copy PLC outputs → IOmap (microseconds)
unlock buffer_mutex
SOEM exchange (tens-to-hundreds of us)
lock buffer_mutex
copy IOmap → PLC inputs (microseconds)
unlock buffer_mutex
Tracker registry on the runtime side: scan_cycle_manager exposes
register/unregister so any plugin can plumb its own periodic thread
into the STATS endpoint. The bus thread registers under
"ecat-<master>" so the editor renders its cycle timing alongside
the IEC tasks. Thunks are surfaced through plugin_runtime_args_t
so plugins don't link runtime symbols directly.
Drops:
- cycle_start/cycle_end plugin entry points (loader tolerates
their absence; logged "(optional)" already)
- tick_divisor / task_name / task_cycle_time_us config (replaced
by the dedicated thread driving cycle_time_us directly)
Adds: task_priority field on the master config (1-99, default 90).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ting
The tracker registry I added for the new EtherCAT bus thread was a
parallel pipe to a destination that already had a richer one. The
plugin's own status_snapshot — refreshed by the monitor thread and
published via /api/discovery/ethercat/{runtime-status,diagnostics}
— already exposes cycle_count, wkc_error_count, avg/max cycle_us,
avg/max exchange_us, per-slave AL state, recovery info. Far more
informative than my generic scan_cycle_tracker (5 cards).
Removes the parallel plumbing entirely:
- scan_cycle_tracker_register / _unregister + the registry array
+ the registry walk in format_timing_stats_response
- 7 plugin_runtime_args_t function pointers + the plugin_tracker_*
typedefs in plugin_types.h
- 7 thunk wrappers in plugin_driver.c
- bus_tracker / tracker_name fields on ecat_master_instance_t
- tracker_init/register/cleanup/unregister calls in
start_single_master / stop_single_master
- tracker_start / tracker_end inside the bus loop
Net delta: ~210 lines deleted across runtime + plugin.
Also fixes the bus-cycle stats accounting under the new thread model:
historically `inst->diag.total_ns = inst->diag.exchange_ns` because
the cycle_start hook had nothing else to time. Now the bus thread
holds the buffer mutex twice per cycle (output copy + input copy)
around the SOEM exchange. Time the full work window (start of first
mutex acquisition through end of second mutex release) as `total_ns`
while keeping `exchange_ns` scoped to just the SOEM round-trip. The
delta `total_ns - exchange_ns` now exposes IEC buffer-mutex
contention, which is the diagnostic that matters for an RT bus.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The existing diag fields (cycle_us, exchange_us) only reported
work-window duration — how long each cycle's mutex+exchange+mutex
actually takes. They didn't answer two questions a user running a
1ms bus cycle actually asks:
1. Are we hitting 1ms on average? (cycle period)
2. How late are we waking up? (scheduling latency)
Add scheduling stats captured by the bus thread itself, so they're
independent of the monitor thread's snapshot cadence:
period_ns: actual_wake[N] - actual_wake[N-1]; should equal
interval_ns on a healthy RT system. Spikes > interval
indicate scheduling jitter or preemption.
latency_ns: actual_wake[N] - expected_wake[N]; how much later
than the absolute clock_nanosleep deadline the thread
actually started running.
Tracked min / max / Welford-avg for both, on `inst->diag` and
mirrored into `ecat_master_status_t` as microsecond fields. Surface
in the JSON for both status (`metrics`) and diagnostics (`timing`)
endpoints — same key shape so the editor renders them through one
code path. Latency is signed (int64) to handle the rare case where
clock granularity puts us a fraction ahead of the deadline.
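The two definitions above reduce to simple differences over wake timestamps. This sketch uses a hypothetical function name and made-up sample timestamps for a 1 ms cycle; the real statistics are gathered inside the C bus thread.

```python
def wake_stats(expected_wake_ns: list[int],
               actual_wake_ns: list[int]) -> tuple[list[int], list[int]]:
    """Compute per-cycle period and latency from wake timestamps.

    Hypothetical helper.
    period_ns[N]  = actual_wake[N] - actual_wake[N-1]
    latency_ns[N] = actual_wake[N] - expected_wake[N]  (signed)
    """
    latency = [a - e for a, e in zip(actual_wake_ns, expected_wake_ns)]
    period = [actual_wake_ns[i] - actual_wake_ns[i - 1]
              for i in range(1, len(actual_wake_ns))]
    return period, latency


# Illustrative values only: deadlines every 1 ms, with one slightly early wake.
expected = [0, 1_000_000, 2_000_000, 3_000_000]
actual = [5_000, 1_002_000, 1_999_900, 3_040_000]
period, latency = wake_stats(expected, actual)
print(period)   # [997000, 997900, 1040100]
print(latency)  # [5000, 2000, -100, 40000]
```

The negative entry in the latency list is the early-wake case that motivates storing latency as a signed value.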
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GCC build broke on three things in the bus-thread additions:
1. _GNU_SOURCE not defined → pthread_setname_np() and pthread_kill()
unavailable. Define before any system header (must be early).
2. <signal.h> not included → SIGUSR1, sigaction, struct sigaction
unresolved. Add the header.
3. ecat_bus_thread defined below start_single_master but referenced
by pthread_create() above it. Add a forward declaration near the
other prototypes.
No behavioural change — same code, just compiles now.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Brings in 28 EtherCAT improvements from development:
- Consolidated `ecat_iface_state` module (NIC tuning + IP isolation)
- Lock-free atomic cycle diag with EWMA averaging
- Strict SDO parsing/validation; safe_close on stop
- Cycle-keep-alive during OP recovery; mailbox drain
- Tests for config validator/parser/iface validator/proc
Conflicts resolved to keep the strucpp-impl architecture on top of
dev's new infrastructure:
- `ecat_master_config_t`: keep `task_priority` (no synthetic IEC
task on strucpp), keep dev's `safe_close`, drop `task_name` /
`task_cycle_time_us`.
- `ecat_master_status_t`: dropped — dev removed the snapshot struct
and reads counters directly from atomics on the instance.
- `ecat_cycle_diag_t`: dev's atomic EWMA shape extended with
`period_ns` / `latency_ns` (and min/max/avg variants) so the bus
thread can answer "are we hitting the configured cycle, and how
late are we waking up?" lock-free, same EWMA shift as bus_cycle_ns.
- `ecat_master_instance_t`: keep `bus_thread` + `bus_running`
(dedicated SCHED_FIFO bus thread is the strucpp architecture);
use dev's `iface_state` field; drop the inline iface_iptables_added
/ iface_ipv6_disabled_by_us flags now owned by `ecat_iface_state_t`.
- `ecat_run_one_cycle`: switch to dev's trylock(soem_lock) +
PRIO_INHERIT model; old non-atomic Welford diag block removed in
favour of the atomic EWMA already in place.
- `ecat_bus_thread`: rewritten to use atomics + EWMA for period /
latency stats (matching the rest of the diag struct).
- `ecat_diag_view_t`: extended with the period/latency fields so
both `status` and `diagnostics` JSON builders surface them.
- `diag_reset`: also seeds `min_period_ns` / `min_latency_ns`.
- Removed in-cycle periodic log (dev: "move PLC logging out of the
hot path"); the existing kept stats reach the editor through
/api/discovery/ethercat/{runtime-status,diagnostics}.
Tests:
- `test_ethercat_config_helpers.c`: assert `task_priority == 90`
and `safe_close == true` instead of the removed task_name fields.
Iface isolation (iptables/IPv6 sysctl) still ships in this merge via
the new module; it'll be removed in the follow-up merge of
`fix/ethercat-iface-isolation`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drops the iptables INPUT DROP rule and IPv6 sysctl manipulation from
ecat_iface_state_apply / ecat_iface_state_revert. Those rules were
killing single-NIC boards (Pi 4) the moment the plugin started — the
management interface was the same NIC carrying the EtherCAT traffic.
NIC tuning that was actually delivering the jitter wins is kept:
ethtool coalescing (rx-usecs/tx-usecs=0), offload disables
(GRO/GSO/TSO off), and SO_BUSY_POLL on the SOEM AF_PACKET socket.
Auto-resolved cleanly — no conflicts. iface_state header now carries
NIC-tuning fields only; the iptables/ipv6 helpers and their callers
are gone end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The runtime was synthesising every task name as `plc-task-N`, hiding the actual IEC task names in the editor's stats panel. STruC++ already emits the IEC task name into TaskInstance::name (codegen.ts:2379), so just consume it. Falls back to the synthetic name only when the .so doesn't expose one — keeps older builds working. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two issues fixed at once:
1. The IEC scan tracker was on integer Welford
(avg += (sample - avg) / scan_count). At 1 ms cycles this stalls
silently after ~10⁷ cycles when the increment rounds to zero. At
typical 50-100 ms IEC cycles the same bug still exists, just takes
~2 weeks to bite.
2. The EtherCAT diag was on a fixed-shift EWMA with a 32-sample
window (~32 ms at 1 ms cycle). Editor polls at 2 s, so consecutive
polls sampled fundamentally decorrelated averages — the displayed
avg visibly jumped between min and max.
Both now use:
sum += sample - sum / N (per-cycle update)
avg = sum / N (snapshot read)
with N derived from the configured cycle period so the smoothing
window stays at ~2 s of wall-clock regardless of cycle rate. IEC
trackers compute N at scan_cycle_tracker_init from interval_ns; the
EtherCAT master computes N at start_single_master from cycle_time_us
and stores it on `inst->avg_window`.
Sum form (rather than `avg += (sample - avg)/N`) avoids the
integer-precision stall: increments to the sum are full-magnitude
deltas, never rounded to zero. The cycle_diag fields rename from
`avg_*_ns` to `avg_*_ns_sum` to make the new semantics explicit; only
ethercat_plugin.c touches them so the rename is contained.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
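The stall and its fix can be reproduced with integer arithmetic in Python. This is a sketch under assumptions: the names are hypothetical, the window is clamped to at least 1, the accumulator is seeded from the first sample, and C's truncating division is emulated explicitly because Python's `//` floors.

```python
WINDOW_NS = 2_000_000_000  # ~2 s smoothing window, as stated in the commit text


def c_div(a: int, b: int) -> int:
    """Integer division truncating toward zero, as in C."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def window_from_interval(interval_ns: int) -> int:
    """Derive N from the cycle period (clamp to 1 is an assumption)."""
    return max(1, WINDOW_NS // interval_ns)


def welford_step(avg: int, count: int, sample: int) -> int:
    """Old form: avg += (sample - avg) / count, all in integers."""
    return avg + c_div(sample - avg, count)


class SumAverage:
    """New form: keep a running sum scaled by N and divide only on read."""

    def __init__(self, n: int, first_sample: int) -> None:
        self.n = n
        self.total = first_sample * n  # seed so the first read equals the sample

    def update(self, sample: int) -> None:
        self.total += sample - self.total // self.n

    @property
    def avg(self) -> int:
        return self.total // self.n


# After 10**7 cycles at 1000 ns, a shift to 1500 ns is invisible to Welford:
print(welford_step(1000, 10**7, 1500))  # increment 500 / 10**7 truncates to 0

tracker = SumAverage(window_from_interval(1_000_000), 1000)
for _ in range(20_000):
    tracker.update(1500)
print(tracker.avg)  # has moved up to track the new samples
```

In the sum form each update adds the full `sample - avg` difference to the accumulator, so the only rounding happens when the average is read out.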
Picks up #127: drop migrate_legacy_nic — the legacy NIC state file never shipped in any v4.0.x release (only v4.1.0-rc.1) so the migration target doesn't exist on any installed runtime. 70 lines removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reconciles the VPP plugin-package work onto the strucpp runtime base.
Both feature sets are preserved end-to-end; no shortcut implementations.
Key resolution decisions:
- plugin_types.h: keep both the strucpp debugger-surface function
pointers (debug_array_count/elem_count/size/read/set/write) and
VPP's plugin-initiated PLC stop (request_plc_stop). VPP's
common_ticktime_ns merged into strucpp's renamed base_tick_ns
(same semantic, single name).
- plugin_driver.c: keep all six strucpp debug-thunk wrappers and
add VPP's plugin_request_plc_stop wrapper. Args struct populated
with both new fields.
- plc_state_manager: VPP's edits to plc_state_manager.c don't apply
— strucpp turned that file into .cpp and rewrote the lifecycle.
The behavioural intent VPP encoded (wipe stats on PLC start /
unload) is automatic in strucpp's per-task model: plc_tasks[] is
allocated fresh per program run with zeroed trackers and freed at
the end of plc_cycle_thread, so stale stats can't leak across
runs. The explicit scan_cycle_stats_reset() function and
matching scan_cycle_manager_init/cleanup helpers VPP added are
dead in this model and were removed (not no-op'd) along with
their plc_main.c call sites.
- plcapp_management.py: take strucpp's correct status computation
(combined compile_ok && cleanup_ok) AND VPP's
update_plugin_configurations finalization. Drop strucpp's
auto-start block — VPP's "editor controls START" model is more
flexible (handles COMMAND:BUSY retries, lets the editor decide
whether to restart at all). _wait_for_plc_state helper became
dead code and was removed.
- compile.sh: keep strucpp's `make -j$(nproc) -f Makefile.strucpp`
for the program build; append VPP's vpp_plugin/ compile step
after it (with checksum cache for incremental builds and stale-
plugin cleanup when a project no longer has VPP source).
- ethercat_plugin.c: VPP moved cJSON.{c,h} from each plugin's
own cjson/ subdir to a shared native/cjson/ directory and updated
every native plugin's include path. Auto-merged cleanly;
CMakeLists already point at the shared copy.
- plugins.conf: deleted (VPP's own commit untracked it as
runtime-mutable; strucpp had no opinion on this).
VPP features preserved: plugin reload-on-config-update, get_stats
export, plugin-requested PLC stop, on-target VPP plugin compilation
with checksum cache, plugins.conf finalization after compile,
shared cJSON utility.
Strucpp features preserved: per-task scan-cycle trackers, debugger
surface (force/unforce/read/soft-write), Makefile-based parallel
+ ccache program build, base_tick_ns rename, time-based EWMA
averages, dedicated EtherCAT bus thread, runtime v4 .so contract.
NOT YET TESTED end-to-end on hardware — that's the user's next step.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hardens task lifecycle and the new STruC++ debugger ABI; rebuilds plugin reload around name-matching instead of slot-positional dlclose; replaces the build-breaking MatIEC-era test stub with new debugger ABI tests.

Critical
- Partial-spawn rollback in plc_cycle_thread: if pthread_create fails for task i>0, signal+join 0..i-1 and free plc_tasks before bailing so the next load doesn't dereference orphaned threads.
- plugin_driver_update_config: tear down each old slot by its CURRENT stored type (Python vs Native) before rebuilding from configs[]. The old slot-positional dlclose leaked native handles when a slot's type changed and force-freed Python state on type swap.
- STATS race: plc_tasks_lock brackets alloc/free in the cycle thread and the format_timing_stats_response iteration. is_transitioning doesn't bracket an in-flight STATS call, so a plugin-initiated stop could free plc_tasks underneath the unix-socket reader.
- Test stub: drop the dead plugin_utils.h include and the removed get_var_list/get_var_size/get_var_count stubs that were preventing the C test target from compiling.

High
- plugin_driver_init now sets `initialized` per slot; new plugin_driver_cleanup_init mirror rolls those back in reverse order. load_plc_program calls it on init failure or post-init pthread_create failure so a retry doesn't double-init plugin state.
- plugin_driver_init failure now propagates: load_plc_program drops to ERROR instead of silently spawning the cycle thread on top of a half-failed init.
- plugin_driver_update_config returns -1 if any ENABLED native plugin fails native_plugin_get_symbols; previously this was a warn and the plugin loaded as a silent no-op.
- teardown_plugin_instance refuses to dlclose a still-running plugin: defense against a future caller pulling the .so out from under live function pointers.
- EtherCAT bus thread no longer re-installs SIGUSR1; one process-wide handler is set in plc_main.c. sigaction is process-wide, so multiple bus threads / task threads racing to register were last-writer-wins, and a future divergence between handlers would silently lose half the time.

Medium
- holding_mutex flag is set AFTER pthread_mutex_lock returns; setting it before meant a signal landing during lock acquisition would have the crash handler unlock a mutex this thread never owned (UB on PI mutexes).
- EtherCAT latency_ns = signed(actual_wake) - signed(expected) so an early wake doesn't depend on impl-defined unsigned→signed conversion.
- Debug handler validates `arr` against ext_strucpp_debug_array_count() at FC 0x42/0x43/0x44 entry. Defense-in-depth: the editor's codegen validates inside the .so, but a malformed .so could OOB-read its internal table.
- Debug FC 0x45 caps the MD5 read at 32 bytes regardless of termination: bounded info disclosure if a .so exports a non-null-terminated strucpp_program_md5.
- plc_begin_transition uses an atomic CAS on is_transitioning and re-checks plc_get_state() under the gate, fixing both the check-then-call race in plugin_request_plc_stop and the missing rate-limit (concurrent calls collapse to one worker).
- update_plugin_configurations wrapped in try/except so an exception there flips status to FAILED instead of leaving it pinned at COMPILING (editor would otherwise poll forever).
- VPP plugin .so output moved to build/vpp/ subdir so the cleanup glob can't accidentally match a future built-in plugin .so dropped into build/ directly.

Tests
- tests/test_debug_handler.c: 16 tests across FC 0x41-0x45 wire format: array layout, force/unforce, contiguous range read, cross-array batch, MD5 echo, OOB arr rejection at the runtime gate (issue #17), unbounded-string MD5 mitigation (issue #18), unknown FC, short frames, status passthrough.
- tests/test_scan_cycle_tracker.c: 15 tests covering tracker init, first-cycle seeding, EWMA recovery at avg_window=1, overruns counter, NULL safety.
- tests/support/debug_handler_mocks.{c,h}: controllable fakes for the ext_strucpp_debug_* function pointers and the program-MD5 char*.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
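The CAS gate described for plc_begin_transition can be sketched roughly as follows. This is a minimal illustration, not the runtime's code: `begin_transition`, `end_transition`, `plc_state_t` and the two state values are hypothetical stand-ins, and only the pattern (atomic compare-and-swap on the flag, then a state re-check while holding the gate) comes from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the runtime's state; names are illustrative. */
typedef enum { PLC_STOPPED, PLC_RUNNING } plc_state_t;

static atomic_bool is_transitioning = false;
static _Atomic plc_state_t current_state = PLC_RUNNING;

/* Try to claim the transition gate for a move out of `expected_state`.
 * Returns true only for the single caller that wins the CAS and still
 * sees the expected state once it holds the gate. */
bool begin_transition(plc_state_t expected_state)
{
    bool expected = false;
    if (!atomic_compare_exchange_strong(&is_transitioning, &expected, true)) {
        return false; /* another transition is already in flight */
    }
    /* Re-check under the gate: the state may have changed between the
     * caller's earlier check and the CAS. */
    if (atomic_load(&current_state) != expected_state) {
        atomic_store(&is_transitioning, false);
        return false;
    }
    return true;
}

/* Publish the new state and release the gate. Only the caller that won
 * begin_transition() may call this. */
void end_transition(plc_state_t new_state)
{
    assert(atomic_load(&is_transitioning));
    atomic_store(&current_state, new_state);
    atomic_store(&is_transitioning, false);
}
```

Because losers of the CAS return immediately, concurrent stop requests collapse to one worker, which is the rate-limiting effect the commit mentions.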
The MD5 response trailer was byte-copying the editor-supplied probe bytes back unchanged, so the editor could never observe target byte order from the response; it always got back exactly what it sent.

The contract is now: trailer carries a runtime-driven sentinel, not an echo. Storing the literal 0xDEAD through a native `uint16_t*` makes the two bytes that land in the frame reflect the runtime's byte order:

- LE runtime → trailer = [0xAD, 0xDE]
- BE runtime → trailer = [0xDE, 0xAD]

The editor uses that signal to decide whether subsequent force / read traffic needs byte-swapping at its end. Variable-data path on this side stays pure memcpy via the shared strucpp dispatcher: no server-side adaptation, matching the existing design.

Paired with the editor change at openplc-editor commit efcf2499.
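A minimal sketch of the sentinel idea, under stated assumptions: the function names `write_endian_trailer` and `classify_endian_trailer` are hypothetical, and the sketch uses `memcpy` from a native `uint16_t` rather than a store through a `uint16_t*`, which produces the same bytes without assuming the frame buffer is aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ENDIAN_SENTINEL 0xDEADu

/* Runtime side (illustrative): emit the sentinel in native byte order. */
void write_endian_trailer(uint8_t trailer[2])
{
    assert(trailer != NULL);
    const uint16_t sentinel = ENDIAN_SENTINEL;
    memcpy(trailer, &sentinel, sizeof sentinel);
}

/* Editor side (illustrative): 'L' for a little-endian runtime, 'B' for a
 * big-endian one, '?' if the trailer is not a recognised sentinel. */
char classify_endian_trailer(const uint8_t trailer[2])
{
    assert(trailer != NULL);
    if (trailer[0] == 0xAD && trailer[1] == 0xDE) return 'L';
    if (trailer[0] == 0xDE && trailer[1] == 0xAD) return 'B';
    return '?';
}
```

An echo of the editor's own probe bytes would classify identically on every target, which is why the trailer has to be produced by the runtime.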
VPP plugins are now fully owned by the editor and managed through a
separate vpp_plugins.conf file, independent of the built-in
plugins.conf that the runtime owns.
Editor (program upload with VPP target)
└─ generates vpp_plugins.conf (name, .so path, config path)
└─ generates conf/<name>.json (plugin config payload)
Runtime (on upload receipt)
├─ apply_vpp_plugin_conf(): copies vpp_plugins.conf to runtime root
│ and copies conf/<name>.json → build/vpp/<name>.json
│ If the upload has NO vpp_plugins.conf, deletes any existing one
│ so stale VPP drivers from a previous project are never loaded.
└─ update_plugin_configurations(): unchanged, handles built-ins only
(VPP section removed)
Runtime (at PLC start, plugin_driver.c)
├─ plugin_driver_update_config("plugins.conf") — built-ins
└─ plugin_driver_append_config("vpp_plugins.conf") — VPP (absent = ok)
The gate is simply the presence of vpp_plugins.conf on disk. No glob,
no stale-path reconciliation, no built-in/VPP path guessing. Each
upload is authoritative about which VPP runs.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
apply_vpp_plugin_conf() was copying conf/<name>.json to a hardcoded build/vpp/<name>.json destination. Since vpp_plugins.conf already declares the exact config_path the .so will read at runtime, use that field directly; it is the single source of truth. Added a path traversal guard for defence in depth.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
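The commit message does not show the guard itself, and the real one lives in the Python webserver. Purely as an illustration of what a path traversal guard checks, here is a lexical sketch in C. The function name `config_path_is_allowed` and the notion of a single allowed root are assumptions, and the sketch does not resolve symlinks.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if any '/'-separated component of `path` is exactly "..". */
static bool has_dotdot_component(const char *path)
{
    const char *p = path;
    while (*p != '\0') {
        while (*p == '/') p++;            /* skip separators */
        size_t len = strcspn(p, "/");     /* length of this component */
        if (len == 2 && p[0] == '.' && p[1] == '.') return true;
        p += len;
    }
    return false;
}

/* Hypothetical guard: accept `path` only if it lies below `root` and
 * never climbs back out with a ".." component. */
bool config_path_is_allowed(const char *root, const char *path)
{
    assert(root != NULL && path != NULL);
    size_t root_len = strlen(root);
    if (root_len == 0 || strncmp(path, root, root_len) != 0) return false;
    /* Require a separator so "/opt/rt-evil" does not match root "/opt/rt". */
    if (root[root_len - 1] != '/' && path[root_len] != '/') return false;
    return !has_dotdot_component(path + root_len);
}
```

A production guard would normally canonicalise both paths first; the lexical check only rejects the obvious escapes.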
teardown_plugin_instance (added in dcbbc48) calls python_plugin_cleanup for Python slots, which invokes Py_XDECREF and PyObject_CallFunctionObjArgs; both require the GIL. plc_main releases the GIL after initial plugin init, so the second update_config call from load_plc_program lands here without it and SIGSEGVs the moment the first Python plugin (modbus_slave) is torn down. Manifests as the runtime dying silently right after the "PLC State: INIT" log, with safe-mode kicking in after three crashes.

Three changes:

1. plugin_driver_update_config acquires the GIL with PyGILState_Ensure around the teardown + rebuild window when Python is initialized. teardown_plugin_instance needs it explicitly; python_plugin_get_symbols handles cold-start itself (Py_Initialize implicitly puts the calling thread in possession of the GIL).
2. The rebuild loop now reloads Python plugin symbols via python_plugin_get_symbols. teardown set plugin->python_plugin = NULL, so without this rebuild plugin_driver_init's Python branch (which requires plugin->python_plugin && pFuncInit) would silently skip every Python plugin on the second invocation: modbus_slave / modbus_master / opcua would never re-init after a PLC restart even with the crash fixed.
3. plugin_driver_load_config reduced to a thin wrapper around update_config. The previous post-update_config loop re-called *get_symbols on already-loaded slots; now that update_config loads both Python and native symbols, the second call leaked the bundle update_config just allocated.
Summary
This release reconciles two long-running feature branches into a single line targeting v4.1.0 (release candidate 3):
- `.vpp` plugin packages built via `openplc-packages`, so custom hardware support ships separately from the runtime.
49 non-merge commits, 62 files, +4213 −7076 (net deletion: MatIEC carried a lot of weight).
Headline features
STruC++ compiler
A complete replacement for the MatIEC ST → C path:
- `.so` contract (runtime v4): generated programs export a STruC++ ABI. The runtime's `plugin_driver` exposes the debugger surface directly to plugins: `debug_array_count`, `debug_elem_count`, `debug_size`, `debug_read`, `debug_set` (force/unforce), `debug_write` (soft write that respects existing forces). No more flat-index `get_var_*` API.
- `plc_state_manager.cpp` now spawns one thread per IEC task with SCHED_FIFO and absolute `clock_nanosleep` deadlines. Each task owns its own `scan_cycle_tracker_t`; the STATS endpoint reports per-task min/max/avg.
- `plc-task-N` placeholders.
- `scripts/Makefile.strucpp` lets the on-target build saturate available cores; ccache reuses unchanged POU `.o` files automatically.
Vendor Plugin Packages (VPP)
Custom hardware boards now ship as `.vpp` packages instead of being baked into the runtime:
- `plugin_driver` re-`dlopen()`s native plugin `.so`s when the config changes (so user-uploaded VPP plugins refresh without a runtime restart).
- `get_stats`: plugins publish opaque metrics that the STATS endpoint splices into the response (rendered by the editor as a per-plugin card grid).
- `request_plc_stop(reason)`: fieldbus plugins detecting unrecoverable hardware faults can request an async whole-PLC stop through the same transition path as a unix-socket STOP. Non-blocking, logged at error.
- `compile.sh`: the editor uploads `vpp_plugin/` source alongside the IEC program; the runtime compiles it after the program build. Checksum cache (`vpp_plugin/checksum.sha256`) skips rebuild when source hasn't changed.
- `plugins.conf` finalisation moved post-compile so on-target-built VPP plugin entries get registered correctly.
- `cJSON` utility: every native plugin links against `core/src/drivers/plugins/native/cjson/cJSON.{c,h}` instead of carrying its own copy.
Other notable changes
- `COMMAND:BUSY`.
Breaking changes
- `plugin_runtime_args_t` adds debugger thunks (`debug_array_count` etc.), `request_plc_stop`, and renames `common_ticktime_ns` → `base_tick_ns`. Existing native plugins (Modbus, OPC-UA, S7Comm, EtherCAT) updated; out-of-tree plugins need to recompile.
- `scripts/compile.sh` now invokes `make -f scripts/Makefile.strucpp`. MatIEC-era files (`Config0.c`, `glueVars.c`) are rejected with a clear error.
- `get_var_list` / `get_var_count` / `get_var_size` no longer exist. Plugins resolve variables via the new `(arr, elem)` tuples written into their per-plugin config by the editor at compile time.
File scope
- `core/src/drivers/plugin_*.{c,h}`, `plugin_types.h`: ABI extensions
- `core/src/drivers/plugins/native/{ethercat,opcua,modbus,s7comm,...}/`: adapted to the new debugger surface and shared cJSON
- `core/src/plc_app/plc_state_manager.{cpp,h}`: C → C++ rewrite, per-task threading
- `core/src/plc_app/scan_cycle_manager.{c,h}`: per-task tracker, EWMA averaging
- `core/src/plc_app/unix_socket.{c,h}`, `plc_main.c`: STATS multi-task / plugin_stats splice
- `core/src/lib/strucpp_abi.hpp`, `core/strucpp_runtime/`: STruC++ ABI mirror + runtime shim
- `scripts/{compile.sh,Makefile.strucpp}`: new build pipeline
- `webserver/plcapp_management.py`: combined-result status, post-compile plugin config finalisation
- `tests/test_ethercat_*`: coverage expanded
Test plan
- `tasks[]` array and EtherCAT plugin's `plugin_stats`
- `COMMAND:BUSY` retries from the editor side
After merge
Delete `feat/vpp-plugin-support` and `strucpp-impl`; their work is fully captured here.
🤖 Generated with Claude Code