Replaced reorg detection with parent hash verification#224
yug49 wants to merge 4 commits into OpenZeppelin:main
Conversation
Hey @yug49 thanks for contributing!

Sure, it was my pleasure to contribute.

@yug49 could you resolve conflicts so we can review?
Force-pushed 2460912 to 6718eee
done
```rust
/// Information about a block stored in the ring buffer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub(crate) struct BlockInfo<H> {
    pub number: BlockNumber,
    pub hash: H,
}
```
|
|
We can remove this struct and use `alloy::eips::BlockNumHash` instead.
```rust
/// Truncates the buffer to remove all blocks after (and including) the given block number.
/// This is used when a reorg is detected to remove the reorged blocks.
pub fn truncate_from(&mut self, from_number: BlockNumber) {
    self.inner.retain(|info| info.number < from_number);
}
```
I believe doing this is slightly more efficient:

```rust
let idx = self.inner.partition_point(|info| info.number < from_number);
self.inner.truncate(idx);
```

`retain` visits every element, runs the predicate on each, and swaps the elements that stay into position. Since the buffer is in sorted order, this should be better because we are just dropping the tail.
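A minimal, self-contained sketch of the suggested change, using a stand-in `BlockInfo` with a fixed-size hash (the real type is generic over `H`):

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BlockInfo {
    number: u64,
    hash: [u8; 32],
}

fn truncate_from(inner: &mut VecDeque<BlockInfo>, from_number: u64) {
    // Blocks are stored in ascending order, so a binary search finds the
    // first entry at or above `from_number`; everything from there on is
    // the reorged tail and can be dropped in one truncate call.
    let idx = inner.partition_point(|info| info.number < from_number);
    inner.truncate(idx);
}

fn main() {
    let mut buf: VecDeque<BlockInfo> = (100..105)
        .map(|n| BlockInfo { number: n, hash: [n as u8; 32] })
        .collect();
    truncate_from(&mut buf, 102);
    let numbers: Vec<u64> = buf.iter().map(|b| b.number).collect();
    assert_eq!(numbers, vec![100, 101]);
}
```

`VecDeque::partition_point` requires the predicate to partition the buffer, which holds here because block numbers are strictly increasing.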
```rust
// Check if this is a previously seen block (duplicate check)
// This happens when checking `previous_batch_end` to verify it wasn't reorged
if let Some(buffered) = self.buffer.find_by_number(block_number) {
    if buffered.hash == block_hash {
```
I believe this is wrong / may never happen.
We are working under the assumption that if a new block comes in with the same number it will have a different hash, i.e. the only way this happens is if a reorg occurred.
The buffer would be:
[100, 101, 102]
If block 102 comes in, its hash should not exist in the buffer. In that case we'll need to do a parent hash check, i.e. the normal flow of the reorg check.
If this is true we can perhaps remove the `verify_block_on_chain` fn and this if branch.
Maybe we can ignore this if we refactor this file.
I was halfway through reviewing this file, but then I realised it may be easier to restart this file from scratch, as I believe the initial requirements were not detailed enough.
Here is something I had in mind for this PR:
Core Invariant
We store (block_number, block_hash) pairs in a ring buffer as blocks arrive.
Every incoming block's parent hash must be verified against our buffer.
The check function
check(incoming_block) -> ReorgAction {
// ── Case 0: Empty buffer ──────────────────────────────────────
if buffer.is_empty() {
buffer.push(incoming_block.number, incoming_block.hash)
return NoReorg
}
let buffer_front = buffer.front() // oldest entry
let buffer_back = buffer.back() // newest entry
// ── Next expected block (happy path) ──────────────────
if incoming_block.number == buffer_back.number + 1
&& incoming_block.parent_hash == buffer_back.hash {
buffer.push(incoming_block.number, incoming_block.hash)
return NoReorg
}
// ── Scenario 1 and 2: Block within buffer range (rewind to fork point) ─
// The stream rewound to a block we already have stored.
// Covers Scenario 1 (stream rewinds correctly) AND
// Scenario 2 (stream returns a block past the actual fork).
if incoming_block.number >= buffer_front.number
&& incoming_block.number <= buffer_back.number {
return handle_reorg_within_buffer(incoming_block)
}
// ── Scenario 3: Block beyond buffer head (gap) ────────────────────
// The stream jumped ahead. A reorg may or may not have
// happened in the gap — we don't care, the canonical chain
// post-gap is what matters.
if incoming_block.number > buffer_back.number + 1 {
return handle_gap(incoming_block)
}
// ── Edge Case: Block before buffer range ─────────────────────────
// The stream rewound further than our buffer covers.
// We can't verify ancestry — must reset.
if incoming_block.number < buffer_front.number {
return handle_deep_reorg(incoming_block)
}
}

Scenario 1 — Stream rewinds to the fork point
Buffer: [1, 2, 3] (hashes: H1, H2, H3)
Stream: 1 → 2 → 3 → REORG → 2'
Timeline:
Buffer state: [ H1 | H2 | H3 ]
^ ^
front back
Incoming block 2' arrives (number=2)
→ number is within buffer range
→ look up block 1 in buffer → has hash H1
→ verify: incoming_block.parent_hash == H1? ✅
→ fork point = block 1
→ truncate buffer to [1], then push 2'
Buffer after: [ H1 | H2' ]
handle_reorg_within_buffer(incoming_block) {
// Walk backwards from the incoming block's position to find
// the fork point — the deepest block whose hash still matches.
let fork_number = incoming_block.number - 1
// Check if parent is in our buffer
if let Some(stored) = buffer.get(fork_number) {
if incoming_block.parent_hash == stored.hash {
// Fork point found — the parent is still valid
let fork_point = fork_number
// Truncate buffer: remove everything after fork_point
buffer.truncate_after(fork_point)
// Push the new block
buffer.push(incoming_block.number, incoming_block.hash)
return Reorg {
fork_block: fork_point,
reprocess_from: incoming_block.number,
}
}
}
// Parent hash doesn't match what's in our buffer.
// The stream gave us a block past the actual fork point.
// We need to walk back further — go to Scenario 2.
return walk_back_to_find_fork(incoming_block)
}

Scenario 2 — Stream returns a block past the fork point
Buffer: [1, 2, 3] (hashes: H1, H2, H3)
Stream: 1 → 2 → 3 → (2 reorged) → 3'
Timeline:
Incoming block 3' arrives (number=3)
→ number is within buffer range
→ look up block 2 in buffer → has hash H2
→ verify: 3'.parent_hash == H2? ❌ (block 2 was also reorged)
Need to walk backwards via RPC:
→ fetch block 2' from node (at number=2)
→ check: 2'.parent_hash == H1 (our stored block 1)? ✅
→ fork point = block 1
Buffer after: [ H1 | H2' | H3' ]
walk_back_to_find_fork(incoming_block) {
// The incoming block's parent doesn't match our buffer.
// Fetch the actual parent from the node and keep walking
// back until we find a common ancestor.
let mut current = incoming_block
let mut new_blocks = vec![incoming_block] // blocks to replay
loop {
let parent_number = current.number - 1
// If we've walked past our buffer, we can't verify further
if parent_number < buffer.front().number {
return handle_deep_reorg(incoming_block)
}
// Fetch the parent block from the node
let parent_block = rpc.get_block(parent_number)
new_blocks.push(parent_block)
// Check if THIS block's parent matches our buffer
let grandparent_number = parent_block.number - 1
if let Some(stored) = buffer.get(grandparent_number) {
if parent_block.parent_hash == stored.hash {
// ✅ Found the fork point
let fork_point = grandparent_number
buffer.truncate_after(fork_point)
// Push all new blocks oldest-first (new_blocks was collected
// newest-first while walking back, so reverse it before pushing)
// May need new method in ring buffer to push vec.
new_blocks.reverse()
buffer.push(new_blocks)
return Reorg {
fork_block: fork_point
}
}
}
current = parent_block
}
}

Scenario 3 — Stream skips a range (gap)
Buffer: [1, 2, 3] (hashes: H1, H2, H3)
Stream: 1 → 2 → 3 → (6 reorged) → 8
Timeline:
Incoming block 8 arrives (number=8)
→ number > buffer_back + 1 (gap detected: 4, 5, 6, 7 missing)
→ We don't know/care if a reorg happened mid-gap
→ The canonical chain is what matters now
Fill the gap:
→ fetch blocks 4, 5, 6, 7 from the node
→ verify chain continuity:
block 4.parent_hash == H3 (our stored block 3)? ✅
block 5.parent_hash == H4? ✅ (just fetched)
block 6.parent_hash == H5? ✅
block 7.parent_hash == H6? ✅
block 8.parent_hash == H7? ✅
→ push all into buffer
Buffer after: [ ..., H3, H4, H5, H6, H7, H8 ]
(ring buffer may have evicted oldest entries)
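The continuity checks in this timeline can be sketched as a simple walk over the fetched range (stand-in `Header` type with toy `u64` hashes; illustrative only):

```rust
#[derive(Clone, Copy)]
struct Header {
    number: u64,
    hash: u64,
    parent_hash: u64,
}

/// Verify that `gap_blocks` forms an unbroken chain starting from `anchor`
/// (our buffer tail). A broken link means a reorg touched our buffer range.
fn verify_continuity(anchor: Header, gap_blocks: &[Header]) -> bool {
    let mut prev = anchor;
    for block in gap_blocks {
        if block.number != prev.number + 1 || block.parent_hash != prev.hash {
            return false; // parent link broken
        }
        prev = *block;
    }
    true
}

fn main() {
    let h = |number, hash, parent_hash| Header { number, hash, parent_hash };
    let anchor = h(3, 0x3, 0x2); // our stored block 3 (H3)
    // Gap blocks 4..=6, each pointing at its predecessor's hash.
    let gap = [h(4, 0x4, 0x3), h(5, 0x5, 0x4), h(6, 0x6, 0x5)];
    assert!(verify_continuity(anchor, &gap));
    // A mismatched parent hash (block 4 not built on H3) fails the check.
    let bad = [h(4, 0x4, 0x99), h(5, 0x5, 0x4)];
    assert!(!verify_continuity(anchor, &bad));
}
```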
handle_gap(incoming_block) {
let gap_start = buffer.back().number + 1
let gap_end = incoming_block.number // exclusive: the incoming block is pushed separately
// Fetch all missing blocks
let gap_blocks = rpc.get_blocks(gap_start..gap_end)
// Verify the first gap block connects to our buffer
if gap_blocks[0].parent_hash != buffer.back().hash {
// A reorg happened that affected our buffer tail.
// Treat the first gap block as a reorg block and
// use the within-buffer logic.
return handle_reorg_within_buffer(gap_blocks[0])
// Then re-enter handle_gap for the rest... or just
// process all gap_blocks sequentially through check().
}
// Verify internal chain continuity
for i in 1..gap_blocks.len() {
assert!(gap_blocks[i].parent_hash == gap_blocks[i-1].hash)
}
// Verify incoming block connects
assert!(incoming_block.parent_hash == gap_blocks.last().hash)
// Push everything into buffer
for block in gap_blocks {
buffer.push(block.number, block.hash)
}
buffer.push(incoming_block.number, incoming_block.hash)
return NoReorg
}

Scenario 4 (edge case) — Deep reorg past buffer range
Buffer: [50, 51, 52] (buffer only holds N blocks)
Stream: ... → 52 → REORG → 48
Timeline:
Incoming block 48 arrives (number=48)
→ number < buffer_front (50)
→ We have no stored hash for block 47 to verify against
→ Cannot determine fork point from buffer alone
Options:
a) Reset scanner from block 48, rebuild buffer via RPC
b) Increase buffer size to cover expected reorg depths
handle_deep_reorg(incoming_block) {
// We can't verify ancestry — our buffer doesn't go back far enough.
// Fall back to the latest finalized block as a safe anchor.
let finalized = rpc.get_finalized_block()
buffer.clear()
buffer.push(finalized.number, finalized.hash)
return DeepReorg {
restart_from: finalized.number,
}
}

Summary
- `block == back + 1` and parent matches → Happy path → Push to buffer
- `block` within buffer range, parent matches → Scenario 1 (clean rewind) → Truncate + push
- `block` within buffer range, parent mismatch → Scenario 2 (past fork) → Walk back via RPC
- `block > back + 1` (gap) → Scenario 3 (skip) → Fetch gap, verify chain
- `block < front` → Scenario 4 (deep reorg) → Reset buffer and restart from finalized
- Buffer empty → Init → Push first block
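The case analysis above can be sketched as a small classifier (stand-in types with toy `u64` hashes, handlers stubbed as enum variants; one judgment call: `block == back + 1` with a mismatched parent is folded into the within-buffer branch, since it means the buffer tail itself was reorged):

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
enum Action {
    NoReorg,      // happy path or empty buffer: just push
    WithinBuffer, // Scenario 1/2: rewind to the fork point
    Gap,          // Scenario 3: fetch and verify the missing range
    DeepReorg,    // Scenario 4: reset from finalized
}

struct Block {
    number: u64,
    parent_hash: u64,
}

/// Buffer entries are (block_number, block_hash), ascending by number.
fn classify(buffer: &VecDeque<(u64, u64)>, b: &Block) -> Action {
    let Some(&(front, _)) = buffer.front() else {
        return Action::NoReorg; // Case 0: empty buffer
    };
    let &(back_num, back_hash) = buffer.back().unwrap();
    if b.number == back_num + 1 && b.parent_hash == back_hash {
        Action::NoReorg // next expected block, parent matches
    } else if b.number >= front && b.number <= back_num + 1 {
        Action::WithinBuffer // rewind (or mismatched parent at back + 1)
    } else if b.number > back_num + 1 {
        Action::Gap
    } else {
        Action::DeepReorg // b.number < front
    }
}

fn main() {
    let buffer: VecDeque<(u64, u64)> = vec![(100, 1), (101, 2), (102, 3)].into();
    assert_eq!(classify(&buffer, &Block { number: 103, parent_hash: 3 }), Action::NoReorg);
    assert_eq!(classify(&buffer, &Block { number: 101, parent_hash: 1 }), Action::WithinBuffer);
    assert_eq!(classify(&buffer, &Block { number: 108, parent_hash: 8 }), Action::Gap);
    assert_eq!(classify(&buffer, &Block { number: 98, parent_hash: 7 }), Action::DeepReorg);
}
```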
Possible missing cases to consider
- Duplicate block: incoming block has the same number AND hash as the buffer back; treat as a no-op.
- Gap + reorg in gap: Scenario 3 where block 4's parent doesn't match our block 3. This means a reorg affected our buffer range too; we need to fall through to Scenario 2 logic.
- Multiple consecutive reorgs: two reorgs in quick succession before the scanner catches up. The logic should be re-entrant; each call to `check` handles one reorg at a time.
Hopefully this makes it clearer!
Yes, ty
This is much clearer now; I will move forward accordingly.
```rust
/// Returns true if the buffer is empty.
pub fn is_empty(&self) -> bool {
    self.inner.is_empty()
}
```
Maybe, as mentioned in https://github.com/OpenZeppelin/Event-Scanner/pull/224/changes#r2852317356, we should add an `append` method for cases where we need to fill a gap.
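A sketch of what such an `append` could look like, assuming a `VecDeque`-backed ring buffer with a fixed capacity (field and type names are illustrative, not the actual implementation):

```rust
use std::collections::VecDeque;

struct RingBuffer {
    inner: VecDeque<(u64, [u8; 32])>, // (block number, block hash), ascending
    capacity: usize,
}

impl RingBuffer {
    /// Push a batch of blocks (e.g. freshly fetched gap blocks) in order,
    /// evicting the oldest entries once capacity is exceeded.
    fn append(&mut self, blocks: impl IntoIterator<Item = (u64, [u8; 32])>) {
        for block in blocks {
            if self.inner.len() == self.capacity {
                self.inner.pop_front(); // evict oldest
            }
            self.inner.push_back(block);
        }
    }
}

fn main() {
    let mut buf = RingBuffer { inner: VecDeque::new(), capacity: 3 };
    // Append blocks 1..=5 into a buffer that only holds 3 entries.
    buf.append((1..=5).map(|n| (n, [n as u8; 32])));
    let numbers: Vec<u64> = buf.inner.iter().map(|b| b.0).collect();
    assert_eq!(numbers, vec![3, 4, 5]); // oldest two were evicted
}
```

This keeps the eviction logic in one place, so `handle_gap` can push a whole fetched range without re-implementing the capacity check.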
Resolves #214
Description
Refactors reorg detection from on-chain hash existence checks to parent hash verification against a ring buffer.
Changes
- `ring_buffer.rs`: Added `BlockInfo<H>` struct storing block number + hash, with `find_by_number()` and `truncate_from()` methods
- `reorg_handler.rs`: Replaced hash existence checks with parent hash verification; added `handle_gap()` for intermediate block fetching

Scenarios
Edge cases handled