Skip to content

EMA update wobble#564

Merged
illuzen merged 10 commits into
mainfrom
illuzen/ema-update-wobble
May 21, 2026
Merged

EMA update wobble#564
illuzen merged 10 commits into
mainfrom
illuzen/ema-update-wobble

Conversation

@illuzen
Copy link
Copy Markdown
Contributor

@illuzen illuzen commented May 20, 2026

The difficulty was wobbling quite violently, this changes the ema parameters to be similar to Ethereum

  • asymmetric clamps (5% down, 0.05% up)
  • 1% weight on most recent block

In tests this smoothed the difficulty tremendously, with less than 20% oscillation even when adding and removing a gpu:

Screenshot 2026-05-20 at 4 29 37 PM />

Note

High Risk
High risk because it changes the core PoW difficulty adjustment algorithm and related runtime/storage API, which can materially impact block production stability and network consensus behavior.

Overview
Reworks QPoW difficulty adjustment by removing EMA-based tracking/clamping and switching pallet_qpow::calculate_difficulty to an Ethereum-style per-block adjustment using last block duration, with updated min difficulty floor and new runtime InitialDifficulty.

Removes the BlockTimeEma storage/runtime API surface and related config constants, updates benchmarks/tests/weights accordingly, and updates Prometheus/Grafana to stop emitting/graphing block_time_ema and instead show last_block_duration.

Reviewed by Cursor Bugbot for commit d563a32. Bugbot is set up for automated code reviews on this repo. Configure here.

@v12-auditor
Copy link
Copy Markdown

v12-auditor Bot commented May 20, 2026

Note

Complete: Audit complete. V12 found one issue worth reviewing.

Open the full results here.

FindingSeverityDetails
F-67934 🔴 Critical
Difficulty floor becomes permanent

calculate_difficulty now uses an asymmetric clamp that allows at most a 1/2048 increase but still clips every result below get_min_difficulty() back to 1000. Because the fixed-point multiplication is truncated back to an integer, the best possible recovery step from the floor is floor(1000 * 2049 / 2048) = 1000, so once the chain reaches 1000 it can never leave that value through normal block production. adjust_difficulty stores this result into CurrentDifficulty every block, and subsequent nonce checks always read the stored difficulty, so the network stays pinned to trivial PoW until a runtime upgrade or manual storage repair resets it. The same diff also makes the floor reachable in practice by allowing 5% downward moves while limiting recovery to only about 0.0488% per block.

Analyzed three files, diff e661b3c...e96abc0.

Comment thread runtime/src/configs/mod.rs Outdated
@n13
Copy link
Copy Markdown
Collaborator

n13 commented May 20, 2026

I found the authoritative sources on Ethereum's difficulty adjustment. The PR's claim of "Ethereum-style" is fundamentally misleading — Ethereum's algorithm differs from this PR's in critical ways that affect security.

Authoritative sources on Ethereum's difficulty adjustment

Source Algorithm Notes
EIP-2 (Homestead) parent_diff + parent_diff/2048 * max(1 - (block_time)//10, -99) The original "1/2048" formula
EIP-100 (Byzantium) ... max((2 if uncles else 1) - (block_time)//9, -99) Includes uncles
go-ethereum consensus.go Reference implementation Lines 415–462 of the cached copy
ethereum.stackexchange explainer Worked examples of the formula
Cotten.io: EIP 2/4 15-second target Casual explanation

What Ethereum actually does (from go-ethereum)

# Homestead
diff = parent_diff + (parent_diff / 2048) * max(1 - (block_time // 10), -99)

Key properties:

  • 2048 is the DifficultyBoundDivisor — the same unit step for both up and down
  • adj_factor is an integer in [-99, +1] (Homestead) or [-99, +2] (Byzantium w/ uncle)
  • The adjustment is additive, not multiplicative, and driven by the single block's wall-clock time — not an EMA

Per-block bounds:

  • Max UP: +1 × parent/2048 = +0.0488% per block
  • Max DOWN: −99 × parent/2048 = −4.83% per block

The trigger pattern is what matters:

Single block time adj_factor Effective change
1–9 s +1 +0.0488%
10–19 s 0 0
20–29 s −1 −0.0488%
30–39 s −2 −0.0977%
100–109 s −9 −0.439%
990 s+ −99 (capped) −4.83%

So Ethereum's -99 cap only activates after a single block takes ~16.5 minutes. Normal slow blocks (e.g. 30 s when target is 13 s) move difficulty down by only ~0.1%.

What PR #564 actually does

raw_ratio = target_time / observed_ema_time
if raw_ratio > 1:
    ratio = min(raw_ratio, 1 + 1/2048)   // capped UP at +0.0488%
else:
    ratio = max(raw_ratio, 1 - 5/100)    // capped DOWN at −5%
new_diff = current_diff * ratio
  • Multiplicative, not additive
  • Driven by an EMA (alpha = 1%, ~100-block window) rather than the latest block
  • Max DOWN of −5% triggers as soon as the EMA exceeds target by 5.3% (e.g. 12.63 s when target is 12 s) — orders of magnitude easier than Ethereum's 16.5-minute threshold
  • Max UP of +0.0488% triggers as soon as the EMA is below target by 0.05% (~6 ms on 12 s target)

Why "Ethereum params" is a misleading framing

Property Ethereum (Homestead/Byzantium) This PR
Step size parent/2048 (~0.049%) base step, scaled by signed integer adj_factor Asymmetric clamps: +1/2048 up, −5/100 down
Per-block max DOWN −4.83%, but only at block_time ≥ 990 s −5% at EMA only ~5% above target
Per-block max UP +0.0488% on any sub-target single block +0.0488% on any sub-target EMA
Input signal Direct single-block time Smoothed EMA
Down-direction trigger threshold Catastrophic stall (single 16+ min block) Routine variance
Geometry Additive linear in block time Multiplicative step function

Ethereum's asymmetry is "in extreme catastrophe, drop fast; otherwise both directions move at ~0.05% × |delta| steps." This PR's asymmetry is "in routine slow patches, drop fast; in fast patches, always rise at the tiny floor."

The two are not equivalent — and the difference is exactly the source of the exploitable behavior I flagged earlier:

  1. min_difficulty floor trap: 1000 × (1 + 1/2048) = 1000.488 → 1000 under U512 truncation. Once difficulty hits the floor, the +1/2048 clamp cannot lift it by even 1 unit. (FixedU128 DIV = 10^18 confirmed in sp-arithmetic-28.0.1/src/fixed_point.rs:2412). Ethereum's additive formula has the same parent/2048 step but MinimumDifficulty = 131_072 = 2^17, well above the escape threshold; the PR's floor of 1000 is below the 2048 threshold needed to escape. The get_min_difficulty() comment ("1000 is safe for clamp values ≥ 0.01%") is mathematically wrong: the relevant constraint is min × up_clamp ≥ 1, which requires min ≥ 2048 for a 1/2048 clamp.

  2. Asymmetric hashrate-response window: To double difficulty in response to a 2× hashrate spike — ln(2)/ln(1.000488) ≈ 1420 blocks ≈ 4.7 hours at the 12 s target. Ethereum's equivalent only allows the same +0.0488% per block, but block time is the direct input, so when a 2× hashrate spike arrives blocks come in ~6 s and Ethereum gets +0.0488% per block; the time to catch up is the same (Ethereum has the same hashrate-bandit weakness, which is one reason Ethereum left PoW). What's new here is that the EMA dampens the signal further, slowing response.

  3. Routine-variance drift: With EMA σ ≈ 7.1% of mean (alpha = 1%, single-block CV = 1), the EMA is above 1.053 × target ~23% of the time at equilibrium; each such block snaps difficulty down by 5%, while compensating upward steps are limited to 0.05%. The equilibrium is not at "correct" difficulty — it's biased low.

Closer reference points for what was probably intended

If the goal was Ethereum-style asymmetry, the closer port would be additive with a single divisor, signed integer multiplier, and a −99 lower cap on the multiplier:

# additive form, like EIP-2
step = current_difficulty / 2048
adj_factor = clamp(integer_function_of(observed_block_time, target), -99, +1)
new_difficulty = current_difficulty + step * adj_factor

Notably, EIP-2 §Rationale explicitly argues for using the block time directly (not a moving average) so that a single delayed block doesn't keep penalizing future blocks via a slow-decaying EMA. The PR's EMA-then-clamp construction is what produces the "drop 5% every block for ~100 blocks after a single outage" behavior.

Want me to write up a corrected implementation, or stop at the review and let the PR author redesign?

@n13
Copy link
Copy Markdown
Collaborator

n13 commented May 20, 2026

Review of PR #564 — "EMA update wobble"

Scope: security and soundness of the new asymmetric difficulty-clamp + EMA retune.
Verdict: do not merge as-is. There is a chain-killing bug (difficulty floor trap), a misleading framing as "Ethereum params" that produces fundamentally different dynamics, and a structural downward bias in the difficulty controller that an attacker can amplify.


TL;DR — exploitable / soundness findings

# Severity Issue
1 Critical get_min_difficulty() = 1000 is below the escape threshold for the new 1/2048 up-clamp. Once difficulty reaches the floor, integer truncation in U512 arithmetic permanently traps it there.
2 Critical The clamps are not "Ethereum's clamps." Ethereum's −99 cap requires a single block ≥ 990 s. This PR's −5% cap fires on a routine 5%-above-target EMA. Combined with EMA smoothing, this gives the controller a strong systematic downward bias.
3 High Hashrate-shock response is asymmetric by a factor of ~100×. An attacker who brings hashpower online for a few hours mines a disproportionate share of blocks because the up-clamp throttles recovery to +0.0488% / block (≈4.7 h to double; ≈13 h to 10×).
4 High The EMA window is also retuned (alpha = 100 → 10, i.e. 10% → 1% weight on the newest block). After an outage, the EMA stays elevated for ~100 blocks while the down-clamp keeps firing −5%/block, so a single ~3-hour outage can drain the difficulty into the floor trap.
5 Medium Test mocks use (3/100, 10/100) but the runtime uses (1/2048, 5/100). The unit tests don't actually exercise production behaviour. The outdated comment in test_min_difficulty_can_increase ("ratio clamped to 1.1") confirms the test was originally written against a 10% clamp and was not redesigned for 1/2048.
6 Medium Documentation drift: the doc comments on DifficultyIncreaseClamp / DifficultyDecreaseClamp give 3/100 and 10/100 as illustrative values, far from the actual production 1/2048 and 5/100. The comment in get_min_difficulty ("1000 is safe for clamp values ≥ 0.01%") is mathematically incorrect for the new up-clamp.
7 Low InitialDifficulty raised from 1_189_189 to 2_700_000 with no justification in the PR body or in code comments. Existing // low value for development comment was deleted instead of updated.
8 Low FixedU128::from_rational rounds toward zero. Combined with the asymmetric clamp it adds an additional small downward bias on every adjustment.

1. Critical — min_difficulty floor trap

pallets/qpow/src/lib.rs:

pub fn get_min_difficulty() -> Difficulty {
    // 1000 is safe for clamp values >= 0.01%
    U512::from(1000u64)
}

With the runtime's new DifficultyIncreaseClamp = FixedU128::from_rational(1, 2048) (~0.0488%):

FixedU128 uses DIV = 10^18 (confirmed in sp-arithmetic-28.0.1/src/fixed_point.rs:2412). The adjustment computes:

ratio              = 1 + 1/2048  →  1_000_488_281_250_000_000  (as FixedU128 inner)
numerator          = 1000 * 1_000_488_281_250_000_000
                   = 1_000_488_281_250_000_000_000
adjusted (U512 /)  = numerator / 10^18  =  1000.488…  →  floor → 1000

So at current_difficulty == 1000, the largest possible upward step lands back at 1000. The post-multiplication if adjusted < min_difficulty re-clip cannot help, because the multiplication already truncated to the floor.

The escape condition is min × up_clamp ≥ 1, i.e. with up_clamp = 1/2048, the floor must be ≥ 2048, not 1000. The comment "1000 is safe for clamp values ≥ 0.01%" is wrong: the actual constraint is min ≥ 1 / up_clamp.

Can the floor actually be reached?

From InitialDifficulty = 2_700_000, reaching 1000 is ln(2700)/ln(0.95) ≈ 154 blocks of binding down-clamp. The down-clamp binds whenever the EMA exceeds target by ≥ 5.3%. With alpha = 0.01 and any extended slow patch (validator outage, regional internet failure, big miner powering down) the EMA easily climbs and stays elevated for the entire 154-block window.

Once trapped, the chain is permanently mineable by anyone — difficulty 1000 means a hash threshold of U512::MAX / 1000, which any random hash satisfies. Liveness is preserved but the chain is no longer cryptoeconomically secure.

Cheapest fix

pub fn get_min_difficulty() -> Difficulty {
    // Floor must be > 1 / DifficultyIncreaseClamp so integer arithmetic
    // can ever lift it. With clamp = 1/2048, this needs >= 2048.
    U512::from(4096u64)
}

…or better, derive it from the clamp at runtime: U512::from((2 * one.into_inner() / increase_clamp.into_inner()) as u64) so the constraint cannot drift again.


2. Critical — the algorithm is not "Ethereum's"

The PR description says the changes "use Ethereum params." The actual Ethereum (Homestead / EIP-2, Byzantium / EIP-100) algorithm is:

# Homestead (EIP-2)
diff = parent_diff + (parent_diff // 2048) * max(1 - (block_timestamp - parent_timestamp) // 10, -99)

This differs from this PR on every important axis:

Property Ethereum (EIP-2 / EIP-100) PR #564
Form Additive: parent + parent/2048 × adj Multiplicative: parent × ratio
Step size One unit = parent / 2048 for both directions Up step ≤ parent × 1/2048, down step ≤ parent × 5/100
Signal The actual single-block time EMA of recent block times
Up cap adj = +1 on any sub-10 s block +1/2048 whenever raw_ratio > 1.000488
Down cap adj = −99 only at block_time ≥ 990 s −5% whenever EMA > 1.053 × target
Max DOWN trigger A single 16.5-minute block EMA drifts 5% above target

So the 1/2048 in Ethereum is the base unit step that is then multiplied by a signed integer adjustment factor — it is not a percentage cap on the upward direction. And the −99 cap corresponds to a catastrophic single-block stall, not the moderate slow-patch this PR uses as the trigger.

Two illustrative slow-block cases:

Single block time / EMA, target = 12 s Ethereum adj This PR (raw_ratio mode)
18 s 1 - 1 = 00% change EMA ≈ 18 s → ratio = 0.667, clamped to −5%
30 s 1 - 3 = -2−0.097% EMA ≈ 30 s → ratio = 0.4, clamped to −5%
5 s 1 - 0 = +1+0.0488% EMA ≈ 5 s → ratio = 2.4, clamped to +0.0488%

The PR's controller treats a routine 30 s block (about 2.5× target) the same as a 5-hour stall; both clamp to −5%. Ethereum treats them very differently (−0.097% vs. −4.83%).

Why this matters

Because the down-clamp triggers on routine deviations and the up-clamp's threshold is essentially zero (only 0.05% below target needed), the EMA spends most of its time in one of the two clamped regimes. In a stochastic equilibrium with σ(EMA)/μ ≈ √(α/(2-α)) ≈ 7.1% (for α = 1% and exponential block-time variance):

  • P(EMA > 1.053 × target) ≈ Φ(−0.74) ≈ 23% — clamped −5%
  • P(0 < eps < 0.053) ≈ 27% — linear small drop, average ≈ −2.5%
  • P(EMA < target) ≈ 50% — almost all clamped at +0.0488%

Expected per-block change at "correct" difficulty:

E[Δ] ≈ 0.23 × (−0.05) + 0.27 × (−0.025) + 0.50 × (+0.0005)
     ≈ −0.0115 − 0.0068 + 0.0002
     ≈ −1.8% per block

This is uncompensated downward drift. The controller's stable point is below the correct difficulty, which means block times average below target at equilibrium — the opposite of the desired smoothing. The PR's screenshot shows oscillation amplitude shrinking in a short window with a GPU added/removed; it does not demonstrate equilibrium stability over the >100-block EMA window.

References


3. High — asymmetric response makes hashrate-bandit attacks ~100× more profitable

To restore difficulty after a hashrate surge of requires roughly ln(k) / ln(1 + 0.000488) blocks of binding up-clamp:

Hashrate spike Blocks to catch up Wall-clock at 12 s target
~1 420 ~4.7 hours
~3 295 ~11 hours
10× ~4 716 ~15.7 hours

Compare with the previous symmetric 10% clamp: ln(10)/ln(1.1) ≈ 24 blocks ≈ 4.8 minutes to track a 10× spike. The new tuning is roughly 200× slower at responding to upward hashrate shocks.

This is a classic coin-hopping / hashrate-bandit setup: a multi-pool attacker rotates a large rented hashrate on for a few hours, mines a windfall of cheap blocks, and rotates off before the controller catches up. The 5% down-clamp then snaps difficulty back down quickly — making the attacker's next visit equally profitable.


4. High — slowed EMA amplifies the down-clamp trap

EmaAlpha changed from 10010 (out of 1000), so the latest block now contributes 1% (was 10%). Effective half-life is ~69 blocks (was ~7 blocks).

The combined behavior:

  1. Some triggering event (validator nap, GPU offline, network split) makes a chunk of blocks slow.
  2. EMA climbs and stays elevated for ~100 blocks even after blocks return to normal.
  3. While EMA > 1.053 × target, every block decrements difficulty by 5%.
  4. Recovery upward is limited to +0.0488% per block.

A back-of-envelope check: a sustained outage producing blocks 5× target time for 1 hour (~600 s actual blocks ⇒ 6 blocks) is mild. But a 3-hour outage producing blocks at 10× target time → ~90 actual blocks. Each saw −5%. Net: 0.95^90 ≈ 0.99%. Starting from 2.7M, that lands at ~26 760 — recoverable but only at +0.0488%/block, which is ~9 600 blocks (~32 h at 12 s target) of recovery. One ~6-hour outage and the floor trap from §1 is reached.


5. Medium — tests don't exercise the production controller

pallets/qpow/src/mock.rs:

pub const TestDifficultyIncreaseClamp: FixedU128 = FixedU128::from_rational(3, 100);
pub const TestDifficultyDecreaseClamp: FixedU128 = FixedU128::from_rational(10, 100);

The mock uses a 3% / 10% pair while the runtime uses 1/2048 (~0.05%) / 5%. The mock numbers are ~60× larger on the up side and 2× larger on the down side. Consequences:

  • test_min_difficulty_can_increase passes with 1000 × 1.03 = 1030 (mock), but fails with 1000 × 1.000488 = 1000 (runtime). The floor-trap bug from §1 is not caught by any test.
  • The comment in test_min_difficulty_can_increase still reads: // Fast blocks → ratio clamped to 1.1 → floor(1000 * 1.1) = 1100, which corresponds to neither the old 10% clamp nor the new 3% mock. It's left over from a prior parameter set.
  • test_difficulty_recovers_after_sleep uses alpha = 500 (50%) which is wildly different from the runtime's alpha = 10 (1%). It is asserting against numbers that are 50× more responsive than production.

The mock should be aligned with the runtime parameters (or the runtime parameters tested directly under #[cfg(test)] integration tests), and there should be a regression test that asserts:

let lifted = QPow::calculate_difficulty(QPow::get_min_difficulty(), 1, target);
assert!(lifted > QPow::get_min_difficulty());

— which would fail today against the runtime config.


6. Medium — docs/comments drift

  • New doc comments on the Config items give 3/100 and 10/100 as illustrative values; runtime uses 1/2048 and 5/100. Pick one and be consistent, or drop the inline percentages.
  • get_min_difficulty's comment is mathematically wrong for the new clamp (see §1).
  • The PR removed the "low value for development" hint on InitialDifficulty rather than replacing it with a justification for 2_700_000.

7. Low — bias from FixedU128::from_rational rounding

from_rational in sp_arithmetic::FixedU128 rounds down by default. So:

  • When target > observed (up direction), raw_ratio is slightly under-stated, costing fractional bps on every adjustment.
  • When target < observed (down direction), raw_ratio is also under-stated, which makes the down move slightly larger.

This is fourth-order compared to §1–§4, but it's directional bias on top of an already-biased controller. Consider FixedU128::from_rational_with_rounding(.., NearestPrefDown) or computing target - observed in the right direction.


8. Suggested concrete revisions

  1. Block merging until §1 is fixed. This is a clean chain-kill.
  2. Either:
    • Raise get_min_difficulty() to at least 2 / up_clamp (≥ 4096 for 1/2048), and derive it from T::DifficultyIncreaseClamp instead of hard-coding, or
    • Move to an additive Ethereum-style update (diff + diff/2048 × adj_factor) where the floor is naturally escapable.
  3. If the goal is genuinely "Ethereum's params":
    • Use the additive formula in EIP-2 / EIP-100.
    • Drive adj_factor from the single block's time, not the EMA — see EIP-2 §Rationale. If EMA smoothing is desired, do it on the difficulty target itself (i.e. EMA of computed adjustments), not on the input signal.
    • Match MinimumDifficulty = 131_072 (Ethereum's value) or equivalent — well above the 2048 escape threshold.
  4. Align pallets/qpow/src/mock.rs with the runtime params so unit tests actually exercise the production controller; add a regression test for the floor case (calculate_difficulty(min, 1, target) > min).
  5. Tighten the upward clamp threshold asymmetry as well, not just the cap: today the up-clamp binds for any EMA below target by more than 0.05%, throwing away almost all signal. Consider making the up and down clamp triggers symmetric in |delta| even if the caps differ.
  6. Document the choice of InitialDifficulty = 2_700_000 (or revert it — it's an orthogonal change that shouldn't ride a controller-tuning PR).

Appendix — numerical sanity check (Rust pseudo-snippet)

use sp_arithmetic::FixedU128;
use sp_core::U512;

let one_inner: u128 = FixedU128::one().into_inner(); // 1e18
let up_inner: u128 = FixedU128::from_rational(1, 2048).into_inner();
let ratio_inner = one_inner + up_inner;            // ~1.000488e18

let current = U512::from(1000u64);
let numerator = current * U512::from(ratio_inner); // 1.000488e21
let adjusted = numerator / U512::from(one_inner);  // = 1000

assert_eq!(adjusted, U512::from(1000u64));         // ← floor trap

@illuzen
Copy link
Copy Markdown
Contributor Author

illuzen commented May 20, 2026

omg this review is quite hysterical, yes it's not exactly like ethereum's but it has the same clamps. the supporting links it gives are completely irrelevant and it's analysis is bogus.

the min_difficulty thing is correct tho, so will fix that.

n13 added a commit that referenced this pull request May 20, 2026
…ants

- Warn on raw-result clipping to floor/ceiling so a stuck controller
  is diagnosable from logs alone.
- Add proptest invariants for the additive controller: bounds,
  monotone-in-block_time, no-change band flatness, step magnitude.
- Add runtime-parameter integration test asserting the actual
  quantus_runtime constants form a coherent EIP-2 configuration —
  catches the mock/runtime drift that hid the floor trap in #564.
- Remove dead TooFarInFuture variant; the 60s drift bound is
  intentionally enforced via pallet_timestamp::check_inherents.
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit d08ec77. Configure here.

Comment thread pallets/qpow/src/lib.rs
Comment thread pallets/qpow/src/lib.rs
@illuzen
Copy link
Copy Markdown
Contributor Author

illuzen commented May 20, 2026

ok let's just go full ethereum on it

@n13
Copy link
Copy Markdown
Collaborator

n13 commented May 20, 2026

I've done a deep dive on the rewritten PR. Here's a comprehensive review covering all the discussion threads and the new "all-Ethereum" code at commit 2fa368d.

Overall verdict

The pivot is a huge improvement and the core algorithm is now genuinely EIP-2 Homestead style (additive, single-block driven, with +1 / −99 integer adjustment factor). Every Critical / High concern that v12-auditor, n13, and cursor[bot] raised against the original EMA-clamp design has been structurally fixed by the rewrite:

Earlier finding Status
Floor trap (1000 × (1 + 1/2048) → 1000) Fixed — additive form + min = 131_072
"Ethereum-style" was misleading framing Fixed — now actually EIP-2 additive
Asymmetric hashrate-bandit window Fixed — +1/2048 step both directions
EMA dampens single-block penalty for ~100 blocks Fixed — EMA gone, single block input
Routine-variance downward drift Fixed — additive +1/-99 is unbiased near target (modulo bucket calibration, see below)
FixedU128::from_rational rounding bias Fixed — no FixedU128 in the math path

So the chain-killing stuff is gone. What's left is calibration, cleanup, and one migration concern.

Things that still need attention

1. Bucket ratio is 5/6 instead of the canonical 2/3 (calibration drift)

			// Divisor scales with target: 10s divisor for 12s target
			// divisor = target * 10 / 12 = target * 5 / 6
			let divisor_ms = (target_time_ms * 5 / 6).max(1);
			let time_factor = (block_time_ms / divisor_ms) as i64;
			let adjustment = core::cmp::max(1i64 - time_factor, -99i64);

Both Ethereum eras use ratio 1.5 (target / divisor):

  • EIP-2 Homestead: target ≈ 15 s, divisor 10 s → 15/10 = 1.5
  • EIP-100 Byzantium: target ≈ 13.5 s, divisor 9 s → 13.5/9 = 1.5

This PR uses target * 5 / 6 → 12/10 = 1.2.

The consequence is that the stochastic equilibrium of an EIP-2 controller with exponential block times is μ = D / ln 2:

  • divisor = 10 s (this PR) → equilibrium ≈ 14.4 s (≈ 20% slow vs the 12 s target — the target sits at the lower edge of the [10s, 20s) no-change band)
  • divisor = 8 s (PR qpow: Ethereum EIP-2 additive difficulty adjustment #565's 2 * target / 3) → equilibrium ≈ 11.5 s (target centered in [8s, 16s))

The screenshot in the description shows oscillation amplitude shrinking, which is a fast-window property; it doesn't pin down the long-run mean. With these parameters the chain will likely produce blocks at ~14 s on average rather than 12 s. Not chain-killing, but it's the difference between "EIP-2 with the divisor scaled correctly" and "EIP-2 with a slightly bias-up divisor."

Recommendation: change target * 5 / 6target * 2 / 3. One-line fix.

2. Stale comments / dead docs

These leak the old EMA-era model and will be confusing to anyone reading the code:

		/// Called at the end of each block to adjust mining difficulty based on
		/// observed block times using Exponential Moving Average (EMA).
		fn on_finalize(block_number: BlockNumberFor<T>) {
	/// Benchmark for the on_finalize hook which performs EMA-based difficulty adjustment.
	#[benchmark]
	fn on_finalize() {
		// EMA smoothing limits the spike, but alpha=500 is aggressive so recovery
		// takes many blocks. 20 normal blocks bring difficulty to ~18% of pre-sleep.
		assert!(
			recovered > pre_sleep / 10,
			"Difficulty should stay above 10% after sleep. Pre: {}, Post: {}",
			pre_sleep.low_u64(),
			recovered.low_u64()

The recovery in the new algorithm is not EMA-limited and there is no alpha=500. With 1 outage block at 3.6 M ms the test actually lands at ≈ 951_688 (≈ 95% of pre-sleep), not 18%. The assertion still passes, but the comment lies about what the test demonstrates.

Similarly stale (all left over from the multiplicative-ratio era):

		// Fast blocks → ratio clamped to 1.1 → floor(1000 * 1.1) = 1100
		// Slow blocks → ratio clamped to 0.9 → floor(1000 * 0.9) = 900, clips to 1000

3. weights.rs still references BlockTimeEma storage that no longer exists

	/// Storage: `QPoW::LastBlockTime` (r:1 w:1)
	/// Proof: `QPoW::LastBlockTime` (`max_values`: Some(1), `max_size`: Some(8), added: 503, mode: `MaxEncodedLen`)
	/// Storage: `QPoW::BlockTimeEma` (r:1 w:1)
	/// Proof: `QPoW::BlockTimeEma` (`max_values`: Some(1), `max_size`: Some(8), added: 503, mode: `MaxEncodedLen`)

Two occurrences. The weight numbers themselves are likely still close-enough since on_finalize did similar work, but the doc-comments lie about what storage is touched. Either regenerate via --features runtime-benchmarks or hand-edit.

4. Grafana dashboards still reference the removed Prometheus series

The PR removes gauge.with_label_values(&["block_time_ema"]).set(…) but two dashboards still query it:

  • miner-stack/grafana/dashboards/quantus-node/overview.jsonqpow_metrics{data_group="block_time_ema", …}
  • miner-stack/grafana/dashboards/quantus-node/quantus-business.jsonqpow_metrics{data_group="block_time_ema"}

After this PR these panels will go blank in production. Either:

5. min_difficulty = 131_072 is a sudden jump for any in-flight chain currently below that

cursor[bot]'s "New floor snaps stored difficulty" comment on pallets/qpow/src/lib.rs:243 is technically correct: if a deployed chain's CurrentDifficulty is below 131_072 at upgrade time, the very next on_finalize clamps up to the floor. With QPoWInitialDifficulty going from 1_189_1894_000_000 (well above 131_072) this is unlikely to bite, but it's worth checking the current live testnet CurrentDifficulty value before deploying. If it's somehow below 131_072 today, a one-block upward snap will happen — which is probably fine and intended, just worth knowing.

cursor[bot]'s second comment ("Minimum traps slow networks") is describing the intentional behavior of a floor — same property Ethereum has. Not a bug.

6. QPoWInitialDifficulty jump 1_189_1894_000_000

The commit message "fix benchmarks and update initial difficulty" doesn't explain the new value. Worth a one-line comment in runtime/src/configs/mod.rs saying "calibrated for current testnet hashrate at 12 s target" or similar — and ideally not bundled into a controller-tuning PR. Same comment n13 made on the earlier 2_700_000 value applies.

7. Tests still don't exercise the runtime's actual divisor

Mock uses TargetBlockTime = 1000 ms → divisor 833 ms.
Runtime uses TargetBlockTime = 12000 ms → divisor 10 000 ms.

The shape is the same so most tests carry over, but if you change the bucket ratio (item 1) the mock will silently keep working and the production behavior could drift again. PR #565 added a quantus_runtime-targeted integration test asserting calculate_difficulty(min, 1, target) > min against the real runtime constants — that's the regression test that would have caught the earlier 1000-floor bug, and it would be worth porting here too.

Re: PR #565

n13 opened #565 as an alternative to this PR, with:

  • Same EIP-2 additive form (so the core algorithm matches what's in 2fa368d)
  • Bucket = target * 2/3 (= 8 s), the canonical Ethereum ratio
  • DifficultyBoundDivisor, BlockTimeBucketMs, MaxUpAdjFactor, MaxDownAdjFactor exposed as runtime configs
  • min_difficulty derived from the divisor (max(2^17, divisor)) rather than hardcoded
  • BlockTimeEma storage preserved as Prometheus telemetry (so the existing dashboards keep working)
  • Fail-loud guards on divisor == 0 and MaxDown > MaxUp
  • Property tests + runtime-parameter integration test

For a quick merge, this PR (#564) is fine as-is once items 1–4 above are addressed. For a more durable foundation, #565's parameterization makes future retuning a single-line constant change rather than a code change. They're not mutually exclusive — you could merge #564 with the bucket fix and then layer the parameterization on top in a follow-up.

Suggested smallest concrete diff

If you want to land #564 cleanly without absorbing #565, the minimal set is:

  1. pallets/qpow/src/lib.rs:212: target_time_ms * 5 / 6target_time_ms * 2 / 3
  2. pallets/qpow/src/lib.rs:110-111: drop the EMA mention from the doc comment
  3. pallets/qpow/src/benchmarking.rs:18: drop the EMA mention
  4. pallets/qpow/src/tests.rs:333-334, 367, 382: rewrite the three stale comments
  5. pallets/qpow/src/weights.rs:66-67, 89-90: drop the BlockTimeEma storage entries (or regenerate)
  6. Either delete the two Grafana block_time_ema panels, or restore the storage as Prometheus-only telemetry
  7. Decide whether the 4_000_000 bump belongs in this PR (and add a one-line justification if it does)

Math-correctness confirmed on the live branch — cargo check --workspace --all-targets and cargo test -p pallet-qpow both pass (22/22) at 2fa368d.

Want me to put together that minimal cleanup diff against illuzen/ema-update-wobble, or would you rather steer the PR toward #565's parameterized version?

@n13
Copy link
Copy Markdown
Collaborator

n13 commented May 21, 2026

Re-review of PR #564 at d563a32 ("addressed nits")

All five concrete cleanup items from my previous review have been addressed and the algorithm itself is now a clean EIP-2 Homestead port. All 22 pallet-qpow tests pass.

Nits successfully addressed

# Nit Status Where
1 Bucket ratio target * 5 / 6target * 2 / 3 Fixed pallets/qpow/src/lib.rs:211
2 Stale EMA mention in on_finalize doc Fixed pallets/qpow/src/lib.rs:110
3 Stale EMA mention in on_finalize benchmark Fixed pallets/qpow/src/benchmarking.rs:18
4 Stale "ratio clamped to 1.1/0.9" / "EMA alpha=500" comments in tests Fixed pallets/qpow/src/tests.rs:333,367,382
5 weights.rs BlockTimeEma storage entries (and read/write counts) Fixed pallets/qpow/src/weights.rs (reads 5→4, writes 4→3)
6 Grafana block_time_ema panels Fixed both overview.json and quantus-business.json now reference last_block_duration

Math spot-check on d563a32

		/// Calculate new difficulty based on block time.
		/// Uses the same formula as Ethereum PoW:
		/// diff = parent_diff + (parent_diff / 2048) * max(1 - block_time / divisor, -99)
		///
		/// The divisor is 10 seconds for a 12s target (scales proportionally).
		/// This creates these zones:
		/// - < divisor: difficulty increases by 1/2048 (~0.05%)
		/// - divisor to 2*divisor: no change
		/// - 2*divisor to 3*divisor: difficulty decreases by 1/2048
		/// - etc, up to max decrease of 99/2048 (~4.8%)
		pub fn calculate_difficulty(
			parent_difficulty: U512,
			block_time_ms: u64,
			target_time_ms: u64,
		) -> U512 {
			log::debug!(target: "qpow", "📊 Calculating new difficulty ---------------------------------------------");

			// Divisor scales with target: 8s divisor for 12s target
			// divisor = target * 8 / 12 = target * 2 / 3
			let divisor_ms = (target_time_ms * 2 / 3).max(1);
			let time_factor = (block_time_ms / divisor_ms) as i64;
			let adjustment = core::cmp::max(1i64 - time_factor, -99i64);
  • Equilibrium block time = divisor / ln 2 = 8 / ln 2 ≈ 11.54 s for the 12 s target — target sits near the middle of the [8s, 16s) no-change band. ✓
  • test_min_difficulty_can_increase: calculate_difficulty(131_072, 1, 1000) → increment = 131_072/2048 = 64, new = 131_136 > min. Floor is escapable. ✓
  • Sleep-recovery test: 1 h gap → adjustment = -99951_688 (≈ 95 % of pre-sleep, well above the pre_sleep / 10 assertion). ✓

Remaining nits (not blocking, single-line fixes if you care)

  1. Stale doc-comment on calculate_difficulty: still says "The divisor is 10 seconds for a 12s target" — should be 8 seconds now. The inline comment two lines below is already correct. One word fix.
  2. QPoWInitialDifficulty bump 2_700_0004_000_000 still has no in-code justification (calibration source / target hashrate). Either drop a one-line comment or split into a separate calibration PR.
  3. Branch is 2 commits behind main (f997cb0e add wormhole doc, e661b3ce fast-governance feature flip). The unrelated docs/wormhole-zk.md, runtime/Cargo.toml, .github/workflows/ci.yml, and runtime/src/governance/definitions.rs lines in the PR diff will disappear on rebase — worth rebasing so the diff is purely the difficulty rework.

Items intentionally skipped (and that's fine)

  • The runtime-parameter integration test (item 7 of my previous review) is now much less important — get_min_difficulty() is a hardcoded constant and the divisor formula scales proportionally, so mock tests now exercise the same code path as the runtime. The original motivation was that the mock used wildly different clamps from production; that no longer applies.

Verdict

LGTM modulo the three minor nits above. The chain-killing floor trap is gone, the algorithm is genuinely EIP-2 Homestead, and the housekeeping (Grafana, weights, docs, benchmarks) is consistent. Happy to approve once the stale "10 seconds" doc-comment is fixed and the branch is rebased.

Copy link
Copy Markdown
Collaborator

@n13 n13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GTG now

@illuzen illuzen merged commit c309a93 into main May 21, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants