Skip to content

Replaced reorg detection with parent hash verification#224

Open
yug49 wants to merge 4 commits intoOpenZeppelin:mainfrom
yug49:refactor/reorg-detection
Open

Replaced reorg detection with parent hash verification#224
yug49 wants to merge 4 commits intoOpenZeppelin:mainfrom
yug49:refactor/reorg-detection

Conversation

@yug49
Copy link
Contributor

@yug49 yug49 commented Dec 4, 2025

Resolves #214

Description

Refactors reorg detection from on-chain hash existence checks to parent hash verification against a ring buffer.

Changes

  • ring_buffer.rs: Added BlockInfo<H> struct storing block number + hash, with find_by_number() and truncate_from() methods
  • reorg_handler.rs: Replaced hash existence checks with parent hash verification; added handle_gap() for intermediate block fetching

Scenarios

Scenario Before After
Direct reorg (1→2→3→REORG→2) 3 RPC calls 0 RPC calls
Block past fork (1→2→3→(2 reorged)→3) 3 RPC calls 1 RPC call
Gap with reorg (1→2→3→(6 reorged)→8) 1 RPC call 5 RPC calls (but maintains fuller buffer)

Edge cases handled

  • Empty buffer (first block)
  • Reorg deeper than buffer size → fallback to finalized block
  • Gaps in block numbers → fetch intermediate blocks

@0xNeshi
Copy link
Collaborator

0xNeshi commented Dec 5, 2025

Hey @yug49 thanks for contributing!
Our team is currently working though some priority items, so we'll get to reviewing this as soon as our schedule allows.
We appreciate your patience and the contribution!

@yug49
Copy link
Contributor Author

yug49 commented Dec 5, 2025

Sure, it was my pleasure to contribute

@LeoPatOZ LeoPatOZ mentioned this pull request Jan 27, 2026
@0xNeshi
Copy link
Collaborator

0xNeshi commented Feb 4, 2026

@yug49 could you resolve conflicts so we can review?

@yug49 yug49 force-pushed the refactor/reorg-detection branch from 2460912 to 6718eee Compare February 13, 2026 05:06
@yug49
Copy link
Contributor Author

yug49 commented Feb 13, 2026

@yug49 could you resolve conflicts so we can review?

done

Comment on lines +33 to +39
/// Information about a block stored in the ring buffer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub(crate) struct BlockInfo<H> {
pub number: BlockNumber,
pub hash: H,
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove this struct and use this instead:

alloy::eips::BlockNumHash

/// Truncates the buffer to remove all blocks after (and including) the given block number.
/// This is used when a reorg is detected to remove the reorged blocks.
pub fn truncate_from(&mut self, from_number: BlockNumber) {
self.inner.retain(|info| info.number < from_number);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe doing this is slightly more efficient

let idx = self.inner.partition_point(|info| info.number < from_number);
self.inner.truncate(idx);

retain visit every element runs the predicate on each and for elements that stay it swaps them into position. However since we are in sorted order this should be better since we just dropping the tail

// Check if this is a previously seen block (duplicate check)
// This happens when checking `previous_batch_end` to verify it wasn't reorged
if let Some(buffered) = self.buffer.find_by_number(block_number) {
if buffered.hash == block_hash {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is wrong / may never happen.

We are working under the assumption that if a new block comes with the same number it will have a different hash i..e the only way this happened is if a reorg occurred.

The buffer would be:

[100, 101, 102]

If block 102 comes in its hash should not exist in the buffer. In that case we'll need to do a parent hash i..e the normal flow of the reorg check.

If this is true we can perhaps remove verify_block_on_chain fn and this if branch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can ignore this if we refactor this file

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was halfway through reviewing this file but then i realised perhaps it may be easier to restart this file from scratch as i believe the initial requirements were not detailed enough.

Here is something i had in mind for this PR

Image

Core Invariant

We store (block_number, block_hash) pairs in a ring buffer as blocks arrive.
Every incoming block's parent hash must be verified against our buffer.


The check function

check(incoming_block) -> ReorgAction {

    // ── Case 0: Empty buffer ──────────────────────────────────────
    if buffer.is_empty() {
        buffer.push(incoming_block.number, incoming_block.hash)
        return NoReorg
    }

    let buffer_front = buffer.front()  // oldest entry
    let buffer_back  = buffer.back()   // newest entry

    // ──  Next expected block (happy path) ──────────────────
    if incoming_block.number == buffer_back.number + 1
       && incoming_block.parent_hash == buffer_back.hash {
        buffer.push(incoming_block.number, incoming_block.hash)
        return NoReorg
    }

    // ──  Scenario 1 and 2: Block within buffer range (rewind to fork point) ─
    //    The stream rewound to a block we already have stored.
    //    Covers Scenario 1 (stream rewinds correctly) AND
    //    Scenario 2 (stream returns a block past the actual fork).
    if incoming_block.number >= buffer_front.number
       && incoming_block.number <= buffer_back.number {

        return handle_reorg_within_buffer(incoming_block)
    }

    // ── Scenario 3: Block beyond buffer head (gap) ────────────────────
    //    The stream jumped ahead. A reorg may or may not have
    //    happened in the gap — we don't care, the canonical chain
    //    post-gap is what matters.
    if incoming_block.number > buffer_back.number + 1 {
        return handle_gap(incoming_block)
    }

    // ── Edge Case: Block before buffer range ─────────────────────────
    //    The stream rewound further than our buffer covers.
    //    We can't verify ancestry — must reset.
    if incoming_block.number < buffer_front.number {
        return handle_deep_reorg(incoming_block)
    }
}

Scenario 1 — Stream rewinds to the fork point

Buffer:  [1, 2, 3]          (hashes: H1, H2, H3)
Stream:  1 → 2 → 3 → REORG → 2'

Timeline:
  Buffer state:    [ H1 | H2 | H3 ]
                     ^         ^
                   front      back

  Incoming block 2' arrives (number=2)
    → number is within buffer range
    → look up block 1 in buffer → has hash H1
    → verify: incoming_block.parent_hash == H1? ✅
    → fork point = block 1
    → truncate buffer to [1], then push 2'

  Buffer after:    [ H1 | H2' ]
handle_reorg_within_buffer(incoming_block) {
    // Walk backwards from the incoming block's position to find
    // the fork point — the deepest block whose hash still matches.

    let fork_number = incoming_block.number - 1

    // Check if parent is in our buffer
    if let Some(stored) = buffer.get(fork_number) {
        if incoming_block.parent_hash == stored.hash {
            // Fork point found — the parent is still valid
            let fork_point = fork_number

            // Truncate buffer: remove everything after fork_point
            buffer.truncate_after(fork_point)

            // Push the new block
            buffer.push(incoming_block.number, incoming_block.hash)

            return Reorg {
                fork_block: fork_point,
                reprocess_from: incoming_block.number,
            }
        }
    }

    // Parent hash doesn't match what's in our buffer.
    // The stream gave us a block past the actual fork point.
    // We need to walk back further — go to Scenario 2.
    return walk_back_to_find_fork(incoming_block)
}

Scenario 2 — Stream returns a block past the fork point

Buffer:  [1, 2, 3]          (hashes: H1, H2, H3)
Stream:  1 → 2 → 3 → (2 reorged) → 3'

Timeline:
  Incoming block 3' arrives (number=3)
    → number is within buffer range
    → look up block 2 in buffer → has hash H2
    → verify: 3'.parent_hash == H2? ❌ (block 2 was also reorged)

  Need to walk backwards via RPC:
    → fetch block 2' from node (at number=2)
    → check: 2'.parent_hash == H1 (our stored block 1)? ✅
    → fork point = block 1

  Buffer after:    [ H1 | H2' | H3' ]
walk_back_to_find_fork(incoming_block) {
    // The incoming block's parent doesn't match our buffer.
    // Fetch the actual parent from the node and keep walking
    // back until we find a common ancestor.

    let mut current = incoming_block
    let mut new_blocks = vec![incoming_block]  // blocks to replay

    loop {
        let parent_number = current.number - 1

        // If we've walked past our buffer, we can't verify further
        if parent_number < buffer.front().number {
            return handle_deep_reorg(incoming_block)
        }

        // Fetch the parent block from the node
        let parent_block = rpc.get_block(parent_number)
        new_blocks.push(parent_block)

        // Check if THIS block's parent matches our buffer
        let grandparent_number = parent_block.number - 1
        if let Some(stored) = buffer.get(grandparent_number) {
            if parent_block.parent_hash == stored.hash {
                // ✅ Found the fork point
                let fork_point = grandparent_number

                buffer.truncate_after(fork_point)

                // Push all new blocks in order
                // May need new method in ring buffer to push vec.
                buffer.push(new_blocks)
                

                return Reorg {
                    fork_block: fork_point
                }
            }
        }

        current = parent_block
    }
}

Scenario 3 — Stream skips a range (gap)

Buffer:  [1, 2, 3]          (hashes: H1, H2, H3)
Stream:  1 → 2 → 3 → (6 reorged) → 8

Timeline:
  Incoming block 8 arrives (number=8)
    → number > buffer_back + 1 (gap detected: 4, 5, 6, 7 missing)
    → We don't know/care if a reorg happened mid-gap
    → The canonical chain is what matters now

  Fill the gap:
    → fetch blocks 4, 5, 6, 7 from the node
    → verify chain continuity:
        block 4.parent_hash == H3 (our stored block 3)? ✅
        block 5.parent_hash == H4? ✅  (just fetched)
        block 6.parent_hash == H5? ✅
        block 7.parent_hash == H6? ✅
        block 8.parent_hash == H7? ✅
    → push all into buffer

  Buffer after:    [ ..., H3, H4, H5, H6, H7, H8 ]
                   (ring buffer may have evicted oldest entries)
handle_gap(incoming_block) {
    let gap_start = buffer.back().number + 1
    let gap_end   = incoming_block.number  // inclusive

    // Fetch all missing blocks
    let gap_blocks = rpc.get_blocks(gap_start..gap_end)

    // Verify the first gap block connects to our buffer
    if gap_blocks[0].parent_hash != buffer.back().hash {
        // A reorg happened that affected our buffer tail.
        // Treat the first gap block as a reorg block and
        // use the within-buffer logic.
        return handle_reorg_within_buffer(gap_blocks[0])
        // Then re-enter handle_gap for the rest... or just
        // process all gap_blocks sequentially through check().
    }

    // Verify internal chain continuity
    for i in 1..gap_blocks.len() {
        assert!(gap_blocks[i].parent_hash == gap_blocks[i-1].hash)
    }

    // Verify incoming block connects
    assert!(incoming_block.parent_hash == gap_blocks.last().hash)

    // Push everything into buffer
    for block in gap_blocks {
        buffer.push(block.number, block.hash)
    }
    buffer.push(incoming_block.number, incoming_block.hash)

    return NoReorg 
}

Scenario 4 (edge case) — Deep reorg past buffer range

Buffer:  [50, 51, 52]       (buffer only holds N blocks)
Stream:  ... → 52 → REORG → 48

Timeline:
  Incoming block 48 arrives (number=48)
    → number < buffer_front (50)
    → We have no stored hash for block 47 to verify against
    → Cannot determine fork point from buffer alone

  Options:
    a) Reset scanner from block 48, rebuild buffer via RPC
    b) Increase buffer size to cover expected reorg depths
handle_deep_reorg(incoming_block) {
    // We can't verify ancestry — our buffer doesn't go back far enough.
    // Fall back to the latest finalized block as a safe anchor.

    let finalized = rpc.get_finalized_block()

    buffer.clear()
    buffer.push(finalized.number, finalized.hash)

    return DeepReorg {
        restart_from: finalized.number,
    }
}

Summary

  • block == back + 1 and parent matches → Happy path → Push to buffer
  • block within buffer range, parent matches → Scenario 1 (clean rewind) → Truncate + push
  • block within buffer range, parent mismatch → Scenario 2 (past fork) → Walk back via RPC
  • block > back + 1 (gap) → Scenario 3 (skip) → Fetch gap, verify chain
  • block < frontScenario 4 (deep reorg) → Reset buffer and get from finalized
  • Buffer empty → Init → Push first block

Possible missing cases to consider

  1. Duplicate block: incoming block has the same number AND hash as buffer back treat as no-op
  2. Gap + reorg in gap: Scenario 3 where block 4's parent doesn't match our block 3. This means a reorg affected our buffer range too, need to fall through to Scenario 2 logic.
  3. Multiple consecutive reorgs: two reorgs in quick succession before the scanner catches up. The logic should be re-entrant, each call to check handles one reorg at a time.

Hopefully this makes it clearer!

Copy link
Contributor Author

@yug49 yug49 Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ty
This is much clearer now, I will move forward accordingly

/// Returns true if the buffer is empty.
pub fn is_empty(&self) -> bool {
self.inner.is_empty()
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe as mentioned in the https://github.com/OpenZeppelin/Event-Scanner/pull/224/changes#r2852317356

We should add append, in cases where we need to fill a gap

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor reorg detection to use ring buffer parent hash verification

3 participants