[fix] Fix ManagedLedgerImpl's pendingAddEntries leak.#25240
[fix] Fix ManagedLedgerImpl's pendingAddEntries leak.#25240zhaizhibo wants to merge 3 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses pendingAddEntries leaks in ManagedLedgerImpl when the managed ledger transitions to Closed or Fenced while there are queued add operations that were never initiated.
Changes:
- Add
OpAddEntry.closeIfNotInitiated()and use it to fail/remove queued-but-not-initiated add operations on close/fence. - Schedule the post-rollover metadata-update completion to run on the managed ledger executor thread.
- Clear pending writes when
ledgerClosed()is invoked while inFencedstate.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java | Adds a helper to atomically close an OpAddEntry if it has not been initiated. |
| managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java | Adds selective clearing of not-initiated pending adds on close/fence and adjusts rollover callback execution/threading and fenced handling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Fixed |
|
add a test to reproduce the scenario where the ledger is closed before rollover completes(maybe ZooKeeper operations are slow). |
|
add a test to reproduce the scenario where the ledger is fenced before OpAddEntry.addcomplete(). |
e5a8dbd to
5eda9de
Compare
Denovo1998
left a comment
There was a problem hiding this comment.
Perhaps a unit test can be added:
Close occurs after updateLedgersListAfterRollover() has acquired the metadataMutex, but before operationComplete() has executed or just about to execute.
|
@Denovo1998 Thanks for you review, I will fix the commets soon. |
codelipenghui
left a comment
There was a problem hiding this comment.
Thanks for the fix. The two leak scenarios identified in the PR description are real and well-described. However, I found a few issues that should be addressed before merging. See inline comments.
660ba2a to
118b793
Compare
@codelipenghui Thanks for the review. All issues mentioned in the comments have been fixed. Please take another look. |
Motivation
There ard two scenarios where pendingAddEntries can leak in a ManagedLedger.
After an add operation, there may be a need to update the ledger's properties. If the ZooKeeper update fails, the callback sets the ManagedLedger state to fenced. Subsequent write operations that fail will trigger ManagedLedgerImpl#ledgerClosed. For a ManagedLedger in the fenced state, this can result in pendingAddEntries not being properly cleaned up, causing a leak.
The ManagedLedger needs to rollover because the current ledger is full, switching its state to closedLedger or creatingLedger. New incoming add requests are appended to the pendingAddEntries queue but have not yet been initiated. If the ManagedLedger is closed (state set to closed) after the rollover is executed but before its callback returns, the requests in the pendingAddEntries queue will never be completed, resulting in a leak.
Modifications
If the box was checked, please highlight the changes
Documentation
docdoc-requireddoc-not-neededdoc-completeMatching PR in forked repository
PR in forked repository: