fix(chainbase): defensive checks and resource leak fixes by barbatos2011 · Pull Request #10 · barbatos2011/java-tron

barbatos2011 · 2026-04-01T20:02:16Z

What

Replace assert statements with unconditional runtime checks on consensus-critical paths, fix division-by-zero risks, improve thread safety in Chainbase, and fix resource leaks across the entire database close() chain.

Why

Assert statements are disabled by default in production JVM (no -ea flag). On consensus-critical paths, this creates three failure modes depending on JVM configuration:

Without -ea (production default): assertions are silently skipped, code continues with invalid state (e.g., totalEnergyWeight = 0 causes (double) limit / 0L → Infinity → (long) Infinity = Long.MAX_VALUE, giving an account effectively unlimited energy)
With -ea: AssertionError (extends Error, not Exception) is thrown, bypassing all catch (Exception) handlers in the transaction processing chain, crashing the block processing thread
Mixed network: nodes with different -ea flags produce different execution results for the same block → consensus split

Replacing with if-throw using RuntimeException subclasses ensures: (1) the check always executes regardless of JVM flags, (2) the exception is caught by existing error handling, failing the transaction without crashing the node, (3) these conditions cannot be triggered by valid transactions on a correctly functioning network — they guard against data corruption or independent bugs in other subsystems.

Resource leaks in close() methods follow the same root cause across 4 classes: writeOptions.close() and dbSource.closeDB() in a single try block, where the first failure prevents the second from executing. The SnapshotManager.checkV2() leak is a separate issue: close() is not called at all when recover() throws.

Changes

Commit 1: Assert → if-throw replacement

Replace 5 assert statements with explicit if-throw checks:

Location	Condition	Exception
`RepositoryImpl.increase()`	`now < lastTime` (time backwards)	`IllegalArgumentException`
`RepositoryImpl.calculateGlobalEnergyLimit()`	`totalEnergyWeight <= 0`	`IllegalStateException`
`EnergyProcessor.calculateGlobalEnergyLimit()`	same, in else branch symmetric with `allowNewReward()` early return	`IllegalStateException`
`ResourceProcessor.increase()`	`now < lastTime` (same as RepositoryImpl)	`IllegalArgumentException`
`MerklePath.encode()`	`authenticationPath.size() != index.size()`	`ZksnarkException`

Add MerklePathEncodeTest and DefensiveChecksTest covering all 5 exception paths.

Commit 2: Defensive numeric and structural fixes

TransactionTrace: add if (newSize == 0) guard before newArea / newSize division in resetAccountUsage() and resetAccountUsageV2() to prevent ArithmeticException. The zero condition can occur when the caller account has no prior energy consumption history (raw window size = 0)
ResilienceService: replace all new Random() with ThreadLocalRandom.current() to avoid per-call object allocation in a scheduled task; remove Random field from WeightedRandom inner class. SecureRandom is unnecessary here — random peer disconnection only needs statistical uniformity, not cryptographic unpredictability
DbConvert: merge two parallel List<byte[]> (keys/values) into a single List<Map.Entry<byte[], byte[]>>, eliminating index-mismatch risk at the type level. batchInsert() is a private method only called within convertLevelToRocks()

Add TransactionTraceTest cases for newSize == 0 on both V1 and V2 paths.

Commit 3: Thread safety and resource leak fixes

Chainbase.head: declare as volatile for cross-thread visibility. Read methods (getUnchecked, prefixQuery, etc.) are not synchronized, so without volatile reader threads may see stale references under JMM
Chainbase.prefixQuery(): snapshot head into a local variable at method entry, pass it to both prefixQueryRoot() and prefixQuerySnapshot(). Previously, these two sub-methods read head independently — if setHead() is called between the two reads, the results are from different snapshots. volatile alone does not fix this; the local snapshot guarantees single-call consistency
TronDatabase.close(): split writeOptions.close() and dbSource.closeDB() into separate try-catch blocks so one failure does not skip the other
LevelDB.close() / RocksDB.close(): same split try-catch pattern (discovered by tracing the full close() chain — these had the identical bug)
CheckPointV2Store.close(): close own writeOptions in try-catch, then call super.close() instead of dbSource.closeDB() to ensure the full parent cleanup chain executes
SnapshotManager.checkV2(): wrap getCheckpointDB() in try-with-resources so close() runs even if recover() throws

Add ResourceCloseTest (mock writeOptions.close() to throw, verify dbSource.closeDB() is still called) and ChainbaseConcurrentTest (4 threads, 100 iterations of concurrent setHead + prefixQuery).

Scope

Does NOT change consensus rules or transaction validation logic
The assert replacements guard conditions that cannot be triggered by valid transactions on a correctly functioning mainnet — no hard fork required
Does NOT modify any public API or protobuf definitions

Test

All existing framework tests pass (2332 tests, 8 pre-existing net module flaky failures unrelated to this PR)
Added 4 new test classes, 12 new test methods:
- MerklePathEncodeTest: size mismatch → ZksnarkException; empty lists → no exception
- DefensiveChecksTest: energy weight zero throws/returns-zero based on allowNewReward(); time-backwards throws; RepositoryImpl zero weight throws
- TransactionTraceTest (2 new methods): resetAccountUsage V1/V2 handle newSize == 0
- ResourceCloseTest: exception-safe close chain verification with Mockito
- ChainbaseConcurrentTest: concurrent read/write safety
Checkstyle passes

barbatos2011 · 2026-04-01T20:16:19Z

PR Review: fix/defensive-code-safety

Overall direction is good. Here are the issues found:

1. `Chainbase` volatile fix is incomplete

prefixQuery() now uses a local snapshot of head, but other methods that read head (e.g. get(), getUnchecked()) do not apply the same treatment. If those methods are also accessed concurrently, this fix only covers one entry point. Recommend auditing all paths that read head.

2. `TransactionTrace.resetAccountUsageV2()` — windowSize reset when newSize==0 needs verification

The V1 path calls setNewWindowSize(ENERGY, 0L) when newSize == 0, while the V2 path calls setNewWindowSizeV2(ENERGY, 0L). The V1 windowSize is not reset in the V2 path. Please confirm that V2 mode never uses the V1 windowSize, otherwise stale data may remain.

3. `CheckPointV2Store.close()` — subclass and parent each have their own `writeOptions` field

The subclass calls writeOptions.close() then super.close() which closes the parent's writeOptions. This is technically correct since Java fields are not polymorphic, but two identically-named fields are confusing. Consider adding a comment explaining why two separate writeOptions instances need to be closed.

4. `DbConvert` — `AbstractMap.SimpleEntry` overhead

Merging two parallel lists into List<Map.Entry<byte[], byte[]>> eliminates key/value desync risk — good direction. However, AbstractMap.SimpleEntry adds an extra object allocation per iteration. For large database conversions this may have a performance impact. Consider using a simple inner record/class instead.

5. `ResilienceService.WeightedRandom` — `totalWeight` may be 0

int randomNum = ThreadLocalRandom.current().nextInt(totalWeight);

If weights is empty or all weights are 0, totalWeight == 0 and nextInt(0) throws IllegalArgumentException. The original code had the same issue, but since this PR focuses on defensive fixes, it's worth adding a guard here too.

6. Test issues

ChainbaseConcurrentTest: 100 iterations is too few to reliably trigger concurrency bugs; chainbase.reset() / chainbase.close() are not in a finally block so resources leak if the assertion fails; consider adding @Rule Timeout to prevent hangs.
ResourceCloseTest: Heavy use of reflection to replace fields — any field rename will break the test. Fragile.
DefensiveChecksTest: Multiple tests reuse the same hex address string; consider extracting it as a constant.

7. Commit granularity

Commit e7a95dd ("thread safety and resource leak fixes") bundles multiple unrelated change types (volatile, resource close, try-with-resources). Consider splitting into finer-grained commits.

🤖 Generated with Claude Code

barbatos2011 · 2026-04-01T20:25:29Z

Updated Review (after `dad269ac0`)

Most of the previous feedback has been addressed. Here's the updated status:

#	Issue	Status
1	`Chainbase` volatile fix incomplete	Addressed — comment explains `get()`/`getUnchecked()` only read `head` once via `head()`, so volatile alone suffices.
2	`TransactionTrace` V2 windowSize reset	Not addressed — see below.
3	`CheckPointV2Store` dual writeOptions confusion	Addressed — Javadoc now clearly explains the two fields.
4	`DbConvert` `AbstractMap.SimpleEntry` overhead	Not addressed — minor, acceptable as-is.
5	`ResilienceService` totalWeight==0	Addressed — guard added with `IllegalStateException`.
6	`ChainbaseConcurrentTest` quality	Addressed — iterations bumped to 500, `finally` block added, `@Rule Timeout` added.
7	Commit granularity	Not addressed — acceptable since the fix commit is additive.

Remaining concern

TransactionTrace.resetAccountUsageV2() newSize==0 guard — when newSize == 0, the V2 path resets windowSizeV2 via setNewWindowSizeV2(ENERGY, 0L) but does not reset the V1 windowSize. The V1 path does the opposite (setNewWindowSize). If V2 mode truly never reads V1 windowSize this is fine, but please confirm or add a comment explaining why only V2 needs resetting.

Other than that, LGTM.

🤖 Generated with Claude Code

barbatos2011 · 2026-04-01T20:27:23Z

Thanks for the thorough re-review.

Regarding the remaining concern on resetAccountUsageV2() — this is actually safe. setNewWindowSizeV2 internally delegates to setNewWindowSize:

// AccountCapsule.java:1457
public void setNewWindowSizeV2(ResourceCode resourceCode, long newWindowSize) {
    this.setNewWindowSize(resourceCode, newWindowSize);  // ← resets V1 windowSize
    if (!this.getWindowOptimized(resourceCode)) {
        ...
    }
}

So the V2 path does reset the V1 windowSize — setNewWindowSizeV2 is a superset of setNewWindowSize, not an alternative. Both V1 and V2 paths end up writing to the same underlying energyWindowSize protobuf field.

barbatos2011 · 2026-04-01T20:28:18Z

Verified — setNewWindowSizeV2 delegates to setNewWindowSize at AccountCapsule.java:1458, so V1 windowSize is correctly reset in the V2 path. No remaining concerns. LGTM 👍

🤖 Generated with Claude Code

barbatos2011 · 2026-04-01T20:38:55Z

Suggestion: split this PR by risk level

The changes in this PR are all "defensive improvements", but they vary significantly in risk and impact. Grouping them together makes review harder and increases revert blast radius.

Change	Risk	Impact
assert→if-throw	Low	Behavior-equivalent (assert is disabled in production anyway)
ThreadLocalRandom	Low	Nearly no functional change
DbConvert parallel list merge	Low	Only affects offline tooling
Resource close ordering	Medium	Affects node shutdown flow
TransactionTrace divide-by-zero guard	High	Changes core transaction processing control flow with new early returns
Chainbase volatile + local snapshot	High	Concurrency semantic change affecting all prefixQuery calls

Recommend splitting into 3 PRs:

assert→if-throw + ThreadLocalRandom — mechanical replacements, low risk, can be merged quickly
Resource close safety + DbConvert refactor — resource management improvements, medium risk, independently testable
TransactionTrace divide-by-zero guard + Chainbase thread safety — core logic changes, needs the most careful review

Reasons:

If the Chainbase concurrency fix introduces a regression, reverting it would also roll back unrelated assert replacements
The TransactionTrace early return changes actual business logic — different in nature from "defensive checks" and easy to overlook when mixed together
Low-risk changes should not be blocked by high-risk ones

🤖 Generated with Claude Code

barbatos2011 · 2026-04-01T20:39:51Z

Thanks for the suggestion. I see the point about revert blast radius, but I'd push back on the split for this case:

1. These changes share the same audit origin and review context.

All items come from the same security audit report and address the same category: "defensive code that doesn't actually defend in production." Splitting into 3 PRs means 3 rounds of CI, 3 review cycles, and 3 sets of merge conflicts to manage — for a total of ~170 lines of non-test changes.

2. The "high risk" items are lower risk than they appear.

TransactionTrace divide-by-zero guard: the early return sets energyUsage = 0 and windowSize = 0, which is the mathematically correct result when the window size is zero (no consumption window → no usage). It doesn't change the result for any case where newSize > 0. The condition is unlikely to occur on mainnet (requires an account with no prior energy consumption history), and if it does occur today, the node crashes with ArithmeticException — the fix is strictly better.
Chainbase volatile: volatile on a reference field is a standard JMM fix with near-zero overhead on x86. The prefixQuery local snapshot is a 1-line variable assignment that eliminates a read-read inconsistency. Both are minimal, well-understood patterns.

3. The current 3-commit structure already provides granular revert.

Each commit is independently revertable via git revert <hash>:

Commit 1 (assert→if-throw) can be reverted without touching Commit 2 or 3
Commit 3 (Chainbase + resource leaks) can be reverted independently

This gives the same revert granularity as separate PRs without the overhead.

That said, if the team prefers separate PRs as a policy, I'm happy to split. Just want to make sure the tradeoff is explicit.

Replace 5 assert statements with unconditional runtime checks on consensus-critical paths. assert is disabled by default in production JVM (no -ea flag), silently skipping invariant checks. With -ea enabled, AssertionError bypasses catch(Exception) handlers, crashing the node. - RepositoryImpl.increase(): time-backwards check - RepositoryImpl.calculateGlobalEnergyLimit(): zero energy weight check - EnergyProcessor.calculateGlobalEnergyLimit(): same, symmetric with allowNewReward() early return - ResourceProcessor.increase(): time-backwards check - MerklePath.encode(): authentication path / index size mismatch

- TransactionTrace: add zero-check guard before newArea/newSize division in resetAccountUsage() and resetAccountUsageV2() to prevent ArithmeticException when window size is zero - ResilienceService: replace all new Random() with ThreadLocalRandom.current() to avoid unnecessary object allocation; remove Random field from WeightedRandom inner class - DbConvert: merge parallel List<byte[]> (keys/values) into single List<Map.Entry<byte[], byte[]>> to eliminate index-mismatch risk

- Chainbase.head: declare as volatile for cross-thread visibility - Chainbase.prefixQuery(): snapshot head into local variable at entry and pass to prefixQueryRoot()/prefixQuerySnapshot() to guarantee consistency within a single call - TronDatabase.close(): split writeOptions.close() and dbSource.closeDB() into separate try-catch blocks so one failure does not skip the other - CheckPointV2Store.close(): close own writeOptions in try-catch then call super.close() for full parent cleanup chain - LevelDB.close() / RocksDB.close(): same split try-catch pattern - SnapshotManager.checkV2(): wrap getCheckpointDB() in try-with-resources so close() runs even if recover() throws

barbatos2011 changed the title ~~fix(actuator,chainbase,net,plugins): replace assert with explicit checks and defensive fixes~~ fix(chainbase): defensive checks and resource leak fixes Apr 1, 2026

barbatos2011 force-pushed the fix/defensive-code-safety branch from dad269a to c378136 Compare April 1, 2026 20:32

Barbatos added 3 commits April 2, 2026 04:44

barbatos2011 force-pushed the fix/defensive-code-safety branch from c378136 to 4e171df Compare April 1, 2026 20:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(chainbase): defensive checks and resource leak fixes#10

fix(chainbase): defensive checks and resource leak fixes#10
barbatos2011 wants to merge 3 commits intodevelopfrom
fix/defensive-code-safety

barbatos2011 commented Apr 1, 2026 •

edited

Loading

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

barbatos2011 commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

Changes

Commit 1: Assert → if-throw replacement

Commit 2: Defensive numeric and structural fixes

Commit 3: Thread safety and resource leak fixes

Scope

Test

Uh oh!

barbatos2011 commented Apr 1, 2026

PR Review: fix/defensive-code-safety

1. Chainbase volatile fix is incomplete

2. TransactionTrace.resetAccountUsageV2() — windowSize reset when newSize==0 needs verification

3. CheckPointV2Store.close() — subclass and parent each have their own writeOptions field

4. DbConvert — AbstractMap.SimpleEntry overhead

5. ResilienceService.WeightedRandom — totalWeight may be 0

6. Test issues

7. Commit granularity

Uh oh!

barbatos2011 commented Apr 1, 2026

Updated Review (after dad269ac0)

Remaining concern

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

barbatos2011 commented Apr 1, 2026

Suggestion: split this PR by risk level

Uh oh!

barbatos2011 commented Apr 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

barbatos2011 commented Apr 1, 2026 •

edited

Loading

1. `Chainbase` volatile fix is incomplete

2. `TransactionTrace.resetAccountUsageV2()` — windowSize reset when newSize==0 needs verification

3. `CheckPointV2Store.close()` — subclass and parent each have their own `writeOptions` field

4. `DbConvert` — `AbstractMap.SimpleEntry` overhead

5. `ResilienceService.WeightedRandom` — `totalWeight` may be 0

Updated Review (after `dad269ac0`)