Re-send webhook on peer disconnect with orphaned HTLCs #8

Open: amackillop wants to merge 2 commits into lsp-0.2.0 from amackillop_resend-webhook-on-orphaned-htlc


@amackillop

When a serverless SDK peer reconnects from a webhook but disconnects again before channel_reestablish completes, the intercepted HTLC gets stuck: process_pending_htlcs only iterates connected peers, so a disconnected peer's HTLCs just sit there until the 45s expiry timer kills them.

Fix this by firing another SendWebhook from peer_disconnected when orphaned HTLCs remain in the store. The peer wakes up, reconnects, and peer_connected / process_pending_htlcs get another shot at forwarding. If it disconnects again, the cycle repeats until either the forward succeeds or handle_expired_htlcs cleans up at 45s.
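
A minimal sketch of the disconnect-time check, for illustration only. `PeerId`, `Event`, and `HtlcStore` are hypothetical stand-ins rather than the types used in this branch; only the names `peer_disconnected` and `SendWebhook` come from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical peer identifier; the real code would key on a node public key.
type PeerId = u32;

/// Hypothetical event handed to the node layer.
#[derive(Debug, PartialEq)]
enum Event {
    SendWebhook { peer: PeerId },
}

/// Hypothetical store of intercepted HTLCs that are still waiting to be
/// forwarded, keyed by the peer they are destined for.
#[derive(Default)]
struct HtlcStore {
    pending: HashMap<PeerId, Vec<u64>>, // HTLC amounts in msat
}

impl HtlcStore {
    fn insert(&mut self, peer: PeerId, amount_msat: u64) {
        self.pending.entry(peer).or_default().push(amount_msat);
    }

    fn has_pending(&self, peer: PeerId) -> bool {
        self.pending.get(&peer).map_or(false, |htlcs| !htlcs.is_empty())
    }
}

/// On disconnect, re-send the webhook only if the peer still has HTLCs in
/// the store. A peer with nothing pending is left alone.
fn peer_disconnected(store: &HtlcStore, peer: PeerId) -> Option<Event> {
    if store.has_pending(peer) {
        Some(Event::SendWebhook { peer })
    } else {
        None
    }
}
```

In this sketch the retry loop is bounded only by the store: once the expiry handler removes the HTLC, `has_pending` returns false and no further webhook is sent.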

The TOCTOU fix in eecf53e made process_pending_htlcs skip whenever
calculate_htlc_actions_for_peer returned new_channel_needed_msat,
assuming htlc_intercepted or peer_connected had already emitted
OpenChannel. That assumption is wrong for the disconnected-peer
path: htlc_intercepted only stores the HTLC and sends a webhook,
peer_connected defers because channels aren't usable yet, and
nobody ever emits OpenChannel. The timer was the only place that
could, but the skip prevented it. The HTLC sat there until expiry.

Replace the unconditional skip with a pending_channel_opens set
that tracks which peers actually have an OpenChannel in flight.
execute_htlc_actions inserts, channel_ready removes. The timer
only skips if the peer is in the set.
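
Roughly, the bookkeeping could look like the sketch below. `NodeId`, `ChannelOpenTracker`, `on_timer`, and the boolean return values are hypothetical and chosen for illustration; only `pending_channel_opens`, `channel_ready`, and the insert/remove/skip behaviour come from the description above.

```rust
use std::collections::HashSet;

/// Hypothetical peer identifier; the real code would key on a node public key.
type NodeId = u32;

/// Tracks which peers have an OpenChannel in flight.
#[derive(Default)]
struct ChannelOpenTracker {
    pending_channel_opens: HashSet<NodeId>,
}

impl ChannelOpenTracker {
    /// Stand-in for the OpenChannel branch of execute_htlc_actions: record
    /// the peer as having an open in flight. Returns true if it was newly
    /// recorded, i.e. OpenChannel should actually be emitted.
    fn execute_open_channel(&mut self, peer: NodeId) -> bool {
        self.pending_channel_opens.insert(peer)
    }

    /// The channel is usable, so the open is no longer in flight.
    fn channel_ready(&mut self, peer: NodeId) {
        self.pending_channel_opens.remove(&peer);
    }

    /// Timer path: skip only when an open is already in flight for this
    /// peer. Returns true if OpenChannel was emitted on this tick.
    fn on_timer(&mut self, peer: NodeId, new_channel_needed: bool) -> bool {
        if !new_channel_needed || self.pending_channel_opens.contains(&peer) {
            return false;
        }
        self.execute_open_channel(peer)
    }
}
```

With this shape, the first timer tick for a disconnected-path peer emits OpenChannel, later ticks skip while the open is in flight, and the peer becomes eligible again after `channel_ready`.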

Considered making the OpenChannel handler in ldk-node idempotent
instead, but create_channel has real side effects (funding tx) and
deduplicating at that layer would be more invasive for the same
result.