Replace wal_sender_timeout-based liveness with TCP keepalive. #373

ibrarahmad wants to merge 1 commit into main from
Conversation
The apply worker previously relied on wal_sender_timeout as both a server-side disconnect trigger and an indirect keepalive pressure on the subscriber. This caused spurious disconnects in two scenarios: a flood of 'w' messages keeping the subscriber too busy to send 'r' feedback in time, and large transactions whose apply time exceeded wal_sender_timeout.

The workaround was maybe_send_feedback(), which force-sent 'r' after every 10 'w' messages or wal_sender_timeout/2, whichever came first. This was a fragile band-aid that coupled subscriber behavior to a server GUC it cannot control.

Replace the entire mechanism with a clean two-layer model:

- TCP keepalive (keepalives_idle=10, keepalives_interval=5, keepalives_count=3) is the primary liveness detector on both sides. A dead network or crashed host is detected in ~25 seconds.
- wal_sender_timeout=0 is set on replication connections so the walsender never disconnects due to missing 'r' feedback. Liveness on the server side is now handled entirely by TCP keepalive.
- spock.apply_idle_timeout (default 300s) is a subscriber-side safety net for a hung-but-connected walsender whose TCP keepalive probes are answered by the kernel but which sends no data. Set to 0 to disable.

Fix a bug in last_receive_timestamp handling: it was updated unconditionally after every PQgetCopyData call, including when r==0 (no data available). Each 1-second WL_TIMEOUT spin silently reset the timer, making apply_idle_timeout never fire. Move the update to after the r==0 guard so it reflects actual data receipt only.

Remove maybe_send_feedback() as it is no longer needed.
📝 Walkthrough

A new idle-timeout control mechanism was introduced for apply workers with a configurable GUC parameter. Replication connection handling was extended to disable server-side wal_sender_timeout. Timeout logic in the apply worker was refactored from keepalive-based to idle-timeout-based, with supporting infrastructure adjustments throughout the codebase.
🧹 Nitpick comments (1)
src/spock.c (1)
345-356: Consider making TCP keepalive parameters configurable. The hardcoded values (idle=10s, interval=5s, count=3) result in ~25s dead-connection detection. While reasonable for most deployments, high-latency or unreliable network environments may experience false-positive disconnects.
Consider exposing these as GUCs (e.g., spock.keepalives_idle, spock.keepalives_interval, spock.keepalives_count) to allow tuning without code changes.
📒 Files selected for processing (3): include/spock.h, src/spock.c, src/spock_apply.c
```diff
 		 * kernel ACKs them, but no data is being sent.
 		 */
-		if (rc & WL_TIMEOUT)
+		if (rc & WL_TIMEOUT && spock_apply_idle_timeout > 0)
```
It seems like if the walsender just doesn't have data to send for a long time, the subscriber will restart. Am I wrong?
It would be better to modify the walsender a little: skip keepalive messages while busy and rely on TCP status, but send keepalive messages if no data arrives from the WAL. In that case we wouldn't need any subscriber-side GUC at all.