Goal
Expand the Evolve simulator into a comprehensive testing and validation tool that runs in CI on every PR. The finished state: CI catches non-determinism, fuzzes critical paths, and tracks performance regressions automatically.
Current State
- Simulator (evolve_simulator): Seed-based determinism, fault injection, time simulation, basic metrics/reporting. Used in testapp integration tests.
- Fuzzing: Limited to tx encoding (fuzz_decode, fuzz_roundtrip, fuzz_structured) in crates/app/tx/fuzz/. Requires cargo +nightly fuzz; not integrated into CI.
- CI (rust.yml): Runs cargo test --workspace on every PR. Long simulation tests exist but are manual-only (workflow_dispatch).
- No non-determinism detection: No automated check that the same seed produces identical state across runs.
Scope
1. Non-Determinism Detection
2. Expanded Fuzzing
3. Performance Testing & Regression Detection
4. CI Integration
5. Simulator Enhancements
Success Criteria
- Every PR runs simulation tests (short suite, <5 min) + non-determinism dual-execution check.
- Nightly CI job fuzzes STF, storage, and mempool for 30+ min and files issues on findings.
- Performance benchmarks run nightly with regression detection — degradations >10% are flagged.
- A failing simulation always prints its reproduction command.
- Zero known non-determinism sources in the STF execution path.
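The ">10% degradation" criterion above can be made concrete with a small comparison helper. This is a minimal sketch, not the project's implementation: the names `Direction`, `degradation`, and `is_regression` are hypothetical, and it assumes each metric is a single number compared against a stored baseline.

```rust
/// Whether a larger value is better (throughput) or worse (latency).
/// Hypothetical type for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Direction {
    HigherIsBetter,
    LowerIsBetter,
}

/// Relative degradation of `current` against `baseline` as a fraction.
/// Positive means the metric got worse, negative means it improved.
fn degradation(baseline: f64, current: f64, direction: Direction) -> f64 {
    match direction {
        Direction::HigherIsBetter => (baseline - current) / baseline,
        Direction::LowerIsBetter => (current - baseline) / baseline,
    }
}

/// Flags a regression when the degradation exceeds `threshold` (0.10 for 10%).
/// A non-positive baseline is treated as "no usable baseline" and never flags.
fn is_regression(baseline: f64, current: f64, direction: Direction, threshold: f64) -> bool {
    baseline > 0.0 && degradation(baseline, current, direction) > threshold
}
```

For example, a throughput drop from 1000 tx/s to 880 tx/s is a 12% degradation and would be flagged, while a drop to 950 tx/s (5%) would not. Noisy CI runners may call for comparing medians over several runs rather than single samples.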
Implementation Notes
- Start with non-determinism detection (highest value, lowest effort) — dual-execution is just "run twice, compare hashes."
- For fuzzing, consider bolero: one harness can run under fuzzing engines such as libfuzzer and also as a randomized test under plain cargo test, which avoids the nightly-only cargo-fuzz limitation.
- Performance baselines can use GitHub Actions artifacts or git notes for storage.
- Keep CI wall time in check — simulation and fuzzing are useless if they make PRs slow. Short suite on PR, long suite on nightly.
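The "run twice, compare hashes" idea can be sketched as follows. Everything here is an assumption made for illustration: `run_simulation` is a toy stand-in for a real simulator run, `SplitMix64` is a tiny PRNG used to avoid external crates, and the state is a plain `BTreeMap`. The reproduction hint in the error reuses the `just sim-seed <seed>` command mentioned in the scope notes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Tiny deterministic PRNG (SplitMix64) so the sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical stand-in for one simulator run: applies `steps` pseudo-random
/// balance updates and returns a hash of the final state.
fn run_simulation(seed: u64, steps: u32) -> u64 {
    let mut rng = SplitMix64(seed);
    // BTreeMap iterates in key order, so hashing the state is order-stable.
    let mut state: BTreeMap<u64, u64> = BTreeMap::new();
    for _ in 0..steps {
        let account = rng.next_u64() % 16;
        let amount = rng.next_u64() % 1_000;
        *state.entry(account).or_insert(0) += amount;
    }
    // DefaultHasher::new() uses fixed keys, unlike a HashMap's RandomState.
    let mut hasher = DefaultHasher::new();
    state.hash(&mut hasher);
    hasher.finish()
}

/// Runs the same seed twice; on mismatch, returns an error that carries
/// the reproduction command.
fn check_determinism(seed: u64, steps: u32) -> Result<u64, String> {
    let first = run_simulation(seed, steps);
    let second = run_simulation(seed, steps);
    if first == second {
        Ok(first)
    } else {
        Err(format!(
            "non-determinism for seed {seed}: {first:#x} != {second:#x}; \
             reproduce with: just sim-seed {seed}"
        ))
    }
}
```

In a real check, the two runs would ideally happen in separate processes, since some sources of non-determinism (for example per-process hash seeds) only show up across process boundaries.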
Scope Details
1. Non-Determinism Detection
- HashMap/HashSet usage leaks into STF execution paths (beyond the existing clippy lint; runtime verification).
- SystemTime/Instant usage reaches STF execution. Simulator's SimulatedTime should be the only time source during execution.
2. Expanded Fuzzing
- apply_block path with randomly generated blocks (random tx ordering, random payloads, malformed inputs). Assert no panics, no state corruption.
3. Performance Testing & Regression Detection
- PerformanceReport with p50/p95/p99 latencies, throughput (tx/s, blocks/s), and memory high-water mark.
4. CI Integration
- just sim-seed <seed> command to reproduce locally.
5. Simulator Enhancements
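The latency and throughput parts of a PerformanceReport could be computed roughly as below. This is a sketch under assumptions: only the name `PerformanceReport` and the listed metrics come from the notes above, while the field names, `percentile`, and `build_report` are hypothetical. It uses the nearest-rank percentile definition and leaves out blocks/s and the memory high-water mark.

```rust
use std::time::Duration;

/// Hypothetical shape for the report; the fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct PerformanceReport {
    p50: Duration,
    p95: Duration,
    p99: Duration,
    tx_per_sec: f64,
}

/// Nearest-rank percentile over an ascending-sorted, non-empty slice.
/// `pct` is expected to be in 1..=100.
fn percentile(sorted: &[Duration], pct: usize) -> Duration {
    // rank = ceil(pct * n / 100), computed in integers and clamped to 1..=n.
    let rank = (pct * sorted.len() + 99) / 100;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Builds a report from per-transaction latencies and total wall time.
/// Returns None when there is nothing to measure.
fn build_report(mut latencies: Vec<Duration>, elapsed: Duration) -> Option<PerformanceReport> {
    if latencies.is_empty() || elapsed.is_zero() {
        return None;
    }
    latencies.sort();
    Some(PerformanceReport {
        p50: percentile(&latencies, 50),
        p95: percentile(&latencies, 95),
        p99: percentile(&latencies, 99),
        tx_per_sec: latencies.len() as f64 / elapsed.as_secs_f64(),
    })
}
```

If the simulator drives SimulatedTime, it would need to be decided whether these latencies are measured in simulated or wall-clock time; only wall-clock numbers are meaningful for regression tracking.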
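The fuzzing goal for the apply_block path (random blocks, malformed inputs, no panics, no state corruption) can be prototyped with the standard library before committing to a fuzzing crate. The sketch below is a plain randomized loop, not a bolero or cargo-fuzz target. `Tx`, `State`, this `apply_block`, and `fuzz_apply_block` are hypothetical stand-ins, since the real Evolve types are not shown here.

```rust
use std::panic;

/// Hypothetical transaction type for illustration.
#[derive(Debug, Clone)]
struct Tx { from: u8, to: u8, amount: u64 }

/// Hypothetical state: four account balances.
#[derive(Debug, Clone, PartialEq)]
struct State { balances: [u64; 4] }

/// Stand-in for the STF entry point: rejects bad input with an error
/// and commits nothing unless the whole block is valid.
fn apply_block(state: &mut State, block: &[Tx]) -> Result<(), String> {
    let mut next = state.clone();
    for tx in block {
        let (from, to) = (tx.from as usize, tx.to as usize);
        if from >= next.balances.len() || to >= next.balances.len() {
            return Err(format!("unknown account in {tx:?}"));
        }
        next.balances[from] = next.balances[from]
            .checked_sub(tx.amount)
            .ok_or_else(|| format!("insufficient funds in {tx:?}"))?;
        next.balances[to] = next.balances[to]
            .checked_add(tx.amount)
            .ok_or_else(|| format!("overflow in {tx:?}"))?;
    }
    *state = next;
    Ok(())
}

/// Tiny deterministic PRNG (SplitMix64) so failures reproduce from the seed.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Feeds random blocks to `apply_block` and checks three properties:
/// no panic, rejected blocks leave state untouched, total supply is conserved.
fn fuzz_apply_block(seed: u64, iterations: u32) -> Result<(), String> {
    let mut rng = SplitMix64(seed);
    let mut state = State { balances: [1_000; 4] };
    let supply: u64 = state.balances.iter().sum();
    for i in 0..iterations {
        let len = (rng.next_u64() % 8) as usize;
        let block: Vec<Tx> = (0..len)
            .map(|_| Tx {
                from: (rng.next_u64() % 6) as u8, // 4 and 5 are deliberately invalid
                to: (rng.next_u64() % 6) as u8,
                amount: rng.next_u64() % 2_000,
            })
            .collect();
        let before = state.clone();
        let outcome = panic::catch_unwind(|| {
            let mut scratch = before.clone();
            let result = apply_block(&mut scratch, &block);
            (scratch, result)
        });
        let (after, result) =
            outcome.map_err(|_| format!("panic at iteration {i}, seed {seed}"))?;
        if result.is_err() && after != before {
            return Err(format!("rejected block mutated state, seed {seed}"));
        }
        if after.balances.iter().sum::<u64>() != supply {
            return Err(format!("supply changed at iteration {i}, seed {seed}"));
        }
        state = after;
    }
    Ok(())
}
```

Because the loop is seeded, any failure message carries enough information to replay the exact block sequence. A coverage-guided engine would explore far more of the input space than this uniform sampling does.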