audit: add LEP-6 foundation storage-truth scaffolding #117

Merged: mateeullahmalik merged 7 commits into master from LEP-6-foundation on Apr 30, 2026


@j-rafique j-rafique commented Apr 20, 2026

Summary

Introduces LEP-6 foundation scaffolding in lumera as an additive, behavior-neutral change set.

This PR establishes the on-chain protocol/state/query surface needed for later LEP-6 milestones, while intentionally keeping current runtime behavior unchanged.

Scope

  • Repo: lumera only
  • Branch: LEP-6-foundation
  • No enforcement activation, no scoring behavior switch, no heal/recheck runtime execution yet

What This PR Adds

1) Proto/API foundation (additive)

  • Added LEP-6 storage-truth models and enums in audit proto
  • Added storage_proof_results to EpochReport (compat field retained)
  • Added new tx message surfaces:
    • SubmitStorageRecheckEvidence
    • ClaimHealComplete
    • SubmitHealVerification
  • Added LEP-6 query surfaces for:
    • node suspicion
    • reporter reliability
    • ticket deterioration
    • heal-op (single and list)
  • Added LEP-6 params and genesis state surfaces, including heal-op id counter

2) State + keeper wiring

  • Added typed state records and CRUD helpers for:
    • Node suspicion state
    • Reporter reliability state
    • Ticket deterioration state
    • Heal-op state + status
  • Added store prefixes and deterministic key/index wiring
  • Added query server wiring for all new LEP-6 query endpoints
  • Added genesis import/export support for all new LEP-6 state collections

3) Params/defaults

  • Added defaults and validation for new LEP-6 params
  • Defaults are no-op/safe to preserve current chain behavior

4) Message handlers (foundation-only)

  • Registered new message interfaces
  • Added behavior-neutral placeholder handlers returning ErrNotImplemented (runtime logic deferred to later PRs)

Follow-up Fixes Applied After Review

Addressed PR review comments:

  • Nil request validation in placeholder msg handlers now returns ErrInvalidRequest (instead of ErrInvalidSigner)
  • Removed unintended self-referential lumera checksum entry from devnet/go.sum
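The placeholder-handler contract described above (reject a nil request with ErrInvalidRequest, otherwise return ErrNotImplemented without touching state) can be sketched as follows. This is a minimal stand-alone illustration, not the PR's code: the sentinel errors are plain Go errors standing in for the module's registered SDK errors, and the message struct with its HealOpID field is a hypothetical stand-in for the generated proto type.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the module's registered errors (assumption: the real ones
// are SDK-registered errors in x/audit/v1/types/errors.go).
var (
	ErrInvalidRequest = errors.New("invalid request")
	ErrNotImplemented = errors.New("not implemented")
)

// MsgClaimHealComplete is a hypothetical, stripped-down stand-in for the
// generated proto message.
type MsgClaimHealComplete struct {
	Creator  string
	HealOpID uint64
}

// ClaimHealComplete checks the request shape and then refuses to act,
// so registering the message cannot change chain behavior.
func ClaimHealComplete(msg *MsgClaimHealComplete) error {
	if msg == nil {
		return ErrInvalidRequest
	}
	return ErrNotImplemented
}

func main() {
	fmt.Println(ClaimHealComplete(nil))
	fmt.Println(ClaimHealComplete(&MsgClaimHealComplete{Creator: "creator", HealOpID: 1}))
}
```

Clients can then distinguish a malformed call from a feature that is not yet active by matching on the returned error.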

Compatibility / Risk

  • Additive and non-breaking
  • Existing active flows remain unchanged (MsgSubmitEpochReport, existing enforcement/evidence paths)

Testing

  • Added/updated keeper, query, params, genesis, and simulation-oriented coverage for introduced foundation surfaces
  • Verified

@j-rafique j-rafique self-assigned this Apr 20, 2026
roomote-v0 Bot commented Apr 20, 2026

Re-reviewed at 868cbc7. This commit adds enforcement matrix predicates, recovery logic, divergence detection, heal-op scheduling/verification, secondary indexes for indexed lookups, and comprehensive pruning of all new state. Integer-only arithmetic throughout for consensus safety, overflow protection via math/big, and proper decay/clamping/saturation. Two previously resolved issues remain fixed. No new issues found. One prior issue remains open (simulation coverage gap for the Everlight payout weighting path).

  • Placeholder msg handlers use ErrInvalidSigner for nil-request validation -- semantically misleading error code for clients
  • devnet/go.sum adds a self-referential LumeraProtocol/lumera v1.10.0 checksum entry -- likely dead weight from go mod tidy
  • Simulation removes CascadeKademliaDbBytes from HostReport in submit_epoch_report_variance.go -- drops coverage of the Everlight payout weighting path

Comment thread x/audit/v1/keeper/msg_storage_truth_placeholders.go Outdated
Comment thread devnet/go.sum Outdated
roomote-v0[bot]
roomote-v0 Bot previously approved these changes Apr 22, 2026
@mateeullahmalik

This comment was marked as resolved.

Copilot AI left a comment

Pull request overview

This PR introduces LEP-6 “storage-truth” foundation scaffolding in the x/audit module, adding proto surfaces, state storage, queries, params, and genesis wiring while keeping runtime behavior intentionally unchanged (handlers are placeholders returning ErrNotImplemented).

Changes:

  • Added storage-truth proto models/enums plus new query + tx RPC surfaces (gateway + AutoCLI wiring included).
  • Added keeper/state scaffolding for storage-truth records (node suspicion, reporter reliability, ticket deterioration, heal-ops) including indexes and genesis import/export.
  • Added new module params defaults + validation and accompanying unit tests/simulation-oriented tests.

Reviewed changes

Copilot reviewed 27 out of 29 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| x/audit/v1/types/query.pb.gw.go | Adds grpc-gateway HTTP handlers/patterns for new storage-truth query endpoints. |
| x/audit/v1/types/params_test.go | Adds tests covering new storage-truth param defaults and validation failures. |
| x/audit/v1/types/params.pb.go | Regenerates params protobuf Go output to include new storage-truth fields + enum. |
| x/audit/v1/types/params.go | Introduces storage-truth params, defaults, ParamSetPairs wiring, and validation rules. |
| x/audit/v1/types/keys.go | Adds KV prefixes/keys for storage-truth state and heal-op indexes/counter. |
| x/audit/v1/types/genesis.pb.go | Regenerates genesis protobuf Go output to include storage-truth state arrays + heal-op counter. |
| x/audit/v1/types/genesis.go | Sets default NextHealOpId in the module’s default genesis. |
| x/audit/v1/types/errors.go | Adds ErrNotImplemented for foundation-only placeholder handlers. |
| x/audit/v1/types/codec.go | Registers new storage-truth Msg types in the interface registry. |
| x/audit/v1/simulation/submit_evidence_test.go | Adds test asserting SubmitEvidence simulation op is a NoOp. |
| x/audit/v1/simulation/submit_epoch_report_variance.go | Removes generation of a field from simulated epoch report variance host report input. |
| x/audit/v1/module/simulation_test.go | Adds test asserting simulation weighted ops include SubmitEvidence. |
| x/audit/v1/module/autocli.go | Adds AutoCLI query/tx command wiring for new storage-truth endpoints/messages. |
| x/audit/v1/keeper/storage_truth_state_test.go | Adds round-trip tests for new storage-truth keeper state CRUD and heal-op indexing behavior. |
| x/audit/v1/keeper/storage_truth_state.go | Adds keeper CRUD for storage-truth state, heal-op indexes, and next heal-op ID counter. |
| x/audit/v1/keeper/query_storage_truth_test.go | Adds query-server tests for storage-truth queries and pagination/listing behavior. |
| x/audit/v1/keeper/query_storage_truth.go | Implements query-server endpoints for storage-truth states and heal-op retrieval/listing. |
| x/audit/v1/keeper/msg_storage_truth_placeholders_test.go | Adds tests for new Msg handlers returning invalid-request/invalid-signer and not-implemented errors. |
| x/audit/v1/keeper/msg_storage_truth_placeholders.go | Adds behavior-neutral placeholder Msg handlers returning ErrNotImplemented. |
| x/audit/v1/keeper/genesis_test.go | Extends genesis round-trip tests to cover new storage-truth genesis surfaces. |
| x/audit/v1/keeper/genesis.go | Adds init/export logic for storage-truth state collections and heal-op ID counter. |
| proto/lumera/audit/v1/tx.proto | Adds new storage-truth tx RPCs and message definitions. |
| proto/lumera/audit/v1/query.proto | Adds new storage-truth query RPCs and request/response messages. |
| proto/lumera/audit/v1/params.proto | Adds StorageTruthEnforcementMode enum and storage-truth params fields. |
| proto/lumera/audit/v1/genesis.proto | Adds storage-truth state collections and heal-op ID counter to genesis. |
| proto/lumera/audit/v1/audit.proto | Adds storage-truth result/state models, heal-op status/state, and extends EpochReport with storage_proof_results. |

Comment thread x/audit/v1/keeper/storage_truth_state.go Outdated
Comment thread x/audit/v1/types/params.go
Comment thread x/audit/v1/simulation/submit_epoch_report_variance.go
…ess (#122)

CRITICAL
- 121-F16: Replace float64 neg-rate arithmetic with integer cross-multiplication
  in ApplyReporterDivergenceAtEpochEnd to eliminate consensus non-determinism
  across validators (different IEEE-754 implementations could produce divergent
  sort orders and median values).
- 121-F15: Remove stale WindowPositiveCount/WindowNegativeCount fallback in
  divergence stats; divergence now exclusively reads per-record KV data.
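To illustrate the 121-F16 change, here is a minimal sketch of comparing two negative-result rates by cross-multiplication instead of float64 division. The `rate` type and `lessRate` helper are hypothetical names, not the PR's identifiers, and the use of math/big for the products is an assumption made here so the example cannot overflow. Totals are assumed to be non-zero.

```go
package main

import (
	"fmt"
	"math/big"
	"sort"
)

// rate keeps a negative-result rate as an exact fraction neg/total.
type rate struct {
	neg, total uint64
}

// lessRate reports whether a < b without dividing. For positive totals,
// a.neg/a.total < b.neg/b.total  <=>  a.neg*b.total < b.neg*a.total.
// big.Int keeps both products exact even beyond 64 bits.
func lessRate(a, b rate) bool {
	lhs := new(big.Int).Mul(new(big.Int).SetUint64(a.neg), new(big.Int).SetUint64(b.total))
	rhs := new(big.Int).Mul(new(big.Int).SetUint64(b.neg), new(big.Int).SetUint64(a.total))
	return lhs.Cmp(rhs) < 0
}

func main() {
	rates := []rate{{3, 4}, {1, 3}, {1, 2}}
	// A stable sort with an exact comparator yields the same order on every node.
	sort.SliceStable(rates, func(i, j int) bool { return lessRate(rates[i], rates[j]) })
	fmt.Println(rates) // [{1 3} {1 2} {3 4}]
}
```

Because no rounding happens anywhere, equal fractions such as 2/4 and 1/2 compare as equal on every validator, which is the property a float64 comparison cannot guarantee.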

HIGH
- 121-F1: RECHECK bucket results bypass storageTruthBookkeepingForResult to
  prevent double-applying the contradiction penalty already handled in
  SubmitStorageRecheckEvidence.
- 121-F2: Use challengedRecord.ReporterAccount (authoritative) instead of
  ticketState.LastReporterSupernodeAccount when targeting the original reporter
  for overturn/confirm scoring in SubmitStorageRecheckEvidence.
- 121-F4: Add time-window guard to StrongPostpone index-failure predicate;
  LastIndexFailEpoch must be within classAWindow epochs (not unbounded).
- 121-F13: Fix inverted eligibility predicate in GetAssignmentEligibleReporters:
  was AND-ing score threshold with ineligibility window, should be OR.
- 120-F3 / 121-F6: Add epoch-window pruning for recheck-evidence dedup keys
  (st/rce/), node failure records (st/nf/), reporter result records (st/rrs/),
  and failed-heal records (st/fh/) in PruneOldEpochs.
- 121-F7: StorageTruthPostponementKey and RecheckEvidence round-trip verified
  via existing tests; genesis wiring confirmed correct.
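The epoch-window pruning item (120-F3 / 121-F6) can be pictured with the sketch below. It assumes, purely for illustration, that each record key is laid out as prefix, then an 8-byte big-endian epoch, then a suffix; the real key shapes are defined in keys.go and may differ. An in-memory map stands in for the KV store, and `epochKey` and `pruneBefore` are hypothetical helpers.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
	"strings"
)

// epochKey builds prefix | big-endian epoch | suffix (assumed layout).
// Big-endian encoding makes byte order match numeric order.
func epochKey(prefix string, epoch uint64, suffix string) string {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], epoch)
	return prefix + string(buf[:]) + suffix
}

// pruneBefore deletes every key under prefix whose epoch is below minKeep
// and returns how many entries were removed.
func pruneBefore(store map[string][]byte, prefix string, minKeep uint64) int {
	keys := make([]string, 0, len(store))
	for k := range store {
		if strings.HasPrefix(k, prefix) && len(k) >= len(prefix)+8 {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys) // deterministic order, as a KV iterator would give
	removed := 0
	for _, k := range keys {
		epoch := binary.BigEndian.Uint64([]byte(k[len(prefix) : len(prefix)+8]))
		if epoch >= minKeep {
			break // keys are sorted by epoch, so nothing older remains
		}
		delete(store, k)
		removed++
	}
	return removed
}

func main() {
	store := map[string][]byte{
		epochKey("st/nf/", 1, "nodeA"):  {1},
		epochKey("st/nf/", 2, "nodeB"):  {1},
		epochKey("st/nf/", 10, "nodeA"): {1},
		epochKey("st/fh/", 1, "ticket"): {1},
	}
	fmt.Println(pruneBefore(store, "st/nf/", 10), len(store)) // 2 2
}
```

Running the same routine once per prefix family keeps each family bounded to the retained epoch window without touching its neighbors.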

MEDIUM
- 119-Copilot-3: Fix first-failure-at-epoch-0 predicate in
  applyTicketDeteriorationDelta; use (!found || epochID != state.LastFailureEpoch)
  so epoch-0 failures are properly counted.
- 119-F7: Same-epoch contradiction check uses <= instead of < so two reporters
  in the same epoch can contradict each other.
- 121-F3 / 119-F5: Remove spurious currentReporterPenalty = -4 on contradiction
  detection; the PASS score delta already provides the -4 recovery signal.
- 121-F12: storageTruthBandStrongPostpone emits a distinct event type
  (storage_truth_band_strong_postpone_candidate) instead of reusing the regular
  postpone event type.
- 122-F3: linkStorageTruthRecheckTranscript now validates that an existing
  recheck transcript record was created by the same challenged hash and reporter
  before silently returning nil (prevents silent hash-collision bypass).
- 122-Copilot-2: Recheck transcript record no longer copies
  DerivationInputHash, ChallengerSignature, ObserverAttestations from the
  challenged record; these fields are zeroed in synthetic recheck records.
- 122-Roomote-B: linkStorageTruthRecheckTranscript uses errorsmod.Wrapf instead
  of fmt.Errorf so errors carry the correct SDK error code for clients.
- 122-Copilot-8: SetStorageTruthTicketArtifactCounts uses errorsmod.Wrap
  instead of fmt.Errorf throughout.
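The epoch-0 fix in 119-Copilot-3 comes down to a zero-value ambiguity: a missing record reads as LastFailureEpoch == 0, which looks identical to a ticket that last failed in epoch 0. The sketch below uses the predicate quoted in the commit message; the `ticketState` struct, its FailureEpochCount field, and the map-backed store are hypothetical simplifications.

```go
package main

import "fmt"

// ticketState is a hypothetical, reduced form of the ticket deterioration record.
type ticketState struct {
	LastFailureEpoch  uint64
	FailureEpochCount uint32
}

// recordFailure counts each distinct failure epoch once.
func recordFailure(store map[string]ticketState, ticketID string, epochID uint64) {
	state, found := store[ticketID]
	// Checking only epochID != state.LastFailureEpoch would skip the very
	// first failure when it happens in epoch 0, because the zero value of a
	// missing record also has LastFailureEpoch == 0.
	if !found || epochID != state.LastFailureEpoch {
		state.FailureEpochCount++
	}
	state.LastFailureEpoch = epochID
	store[ticketID] = state
}

func main() {
	store := map[string]ticketState{}
	recordFailure(store, "t1", 0) // first failure, epoch 0: counted
	recordFailure(store, "t1", 0) // same epoch again: not counted
	recordFailure(store, "t1", 1) // new epoch: counted
	fmt.Println(store["t1"].FailureEpochCount) // 2
}
```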

LOW
- 121-Copilot-1: Move SetRecheckEvidence call after linkStorageTruthRecheckTranscript
  succeeds so that a transcript-link failure does not permanently consume the
  per-(epoch, ticket, creator) dedup slot.
- 120-Copilot-4: ClaimHealComplete emits AttributeKeyHealManifestHash attribute
  instead of the generic AttributeKeyTranscriptHash; new constant added to
  types/events.go.
- 120-Copilot-5: Broaden the events.go file comment to cover all event types.
- 121-Roomote-C: Remove dead mulInt64ByUint64Saturated helper (unreachable code).
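The ordering change in 121-Copilot-1 can be sketched as "do the fallible step first, consume the dedup slot last". Everything here is a simplified model: the `keeper` struct, the string slot key, and the `linkFail` toggle are hypothetical and only exist to make the control flow visible.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errDuplicate  = errors.New("recheck evidence already submitted")
	errLinkFailed = errors.New("transcript link failed")
)

// keeper is a toy model: dedup holds consumed (epoch, ticket, creator) slots.
type keeper struct {
	dedup    map[string]bool
	linkFail bool // simulates a failing transcript link
}

func (k *keeper) linkTranscript() error {
	if k.linkFail {
		return errLinkFailed
	}
	return nil
}

func (k *keeper) submitRecheck(epoch uint64, ticket, creator string) error {
	slot := fmt.Sprintf("%d/%s/%s", epoch, ticket, creator)
	if k.dedup[slot] {
		return errDuplicate
	}
	// Fallible step first: a link failure leaves the slot free for a retry.
	if err := k.linkTranscript(); err != nil {
		return err
	}
	k.dedup[slot] = true // consumed only after the link succeeded
	return nil
}

func main() {
	k := &keeper{dedup: map[string]bool{}, linkFail: true}
	fmt.Println(k.submitRecheck(7, "ticket-1", "creator-a")) // link fails, slot stays free
	k.linkFail = false
	fmt.Println(k.submitRecheck(7, "ticket-1", "creator-a")) // retry succeeds
	fmt.Println(k.submitRecheck(7, "ticket-1", "creator-a")) // now a duplicate
}
```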

Previously landed (prior sessions, squashed here):
- Params fix: WithDefaults preserves UNSPECIFIED enforcement mode instead of
  promoting it to SHADOW; Validate accepts UNSPECIFIED as a valid no-op mode.
- Cascade bytes moved from HostReport to SupernodeMetricsState.Metrics (LEP-6 §12).
- AutoCLI: add EpochAnchor, CurrentEpochAnchor, AssignedTargets entries; skip
  float64-heavy RPC methods (EpochReport, EpochReportsByReporter, HostReports).
- enforcement_empty_active_set_test.go: update mock expectations to Active-only
  GetAllSuperNodes call (LEP-6 §17).
- distribution_freshness_test / query_get_reward_eligibility_test: addSupernode
  before height advance so MetricsState.Height is correctly set to initial height.
- e2e and integration everlight tests: remove stale CascadeKademliaDbBytes from
  HostReport literal; call SetMetricsState for cascade bytes instead.

All 44 packages pass (go test ./... -count=1 -timeout=300s).
roomote-v0[bot]
roomote-v0 Bot previously approved these changes Apr 27, 2026
@roomote-v0 roomote-v0 Bot left a comment

Re-reviewed at 868cbc7. Two previously flagged issues are resolved. No new issues found in this commit. The remaining open item (simulation coverage for the Everlight payout weighting path) is a test-coverage gap, not a blocking defect. Approving.

mateeullahmalik

This comment was marked as resolved.

j-rafique added a commit that referenced this pull request Apr 28, 2026
…resolved

This commit closes ALL 24 findings from Zee's round-2 production-gate
review of PR #117 (review id 4184561676, against tip 868cbc7).

HIGH (4/4):
- NEW-C-3: restore RegistrationFeeShareBps 2% fee-routing block in
  x/action/keeper.DistributeFees that LEP-6 consolidation silently
  deleted. Re-add RewardDistributionKeeper interface, keeper field,
  ctor arg, depinject wiring; routing precedes foundation share
  (matches master).
- NEW-C-1: ExportGenesis now round-trips 8 epoch-scoped audit-state
  prefix families (st/rce, st/nf, st/rrs, st/spt, st/fh, r/, hr/, sc/).
  Adds proto wrapper messages, GetAll* iterators, InitGenesis re-emits
  via existing Set* writers (secondary indexes rebuild naturally).
- NEW-A-12 + NEW-A-17: WindowStartEpoch underflow at scoring window
  resets. Replace raw uint64 subtraction with epochDelta() at
  storage_truth_scoring.go:256/332. ValidateScoreStatesGenesis rejects
  WindowStartEpoch > currentEpoch on both score-state slices.
- NEW-B-1: EXPIRED heal-ops now apply §20 no-show cooldown
  (DeteriorationScore += 15, ProbationUntilEpoch advanced) and write
  st/fh/ failed-heal facts, mirroring the FAILED branch. Prevents
  same-ticket re-schedule loop on silent-healer scenarios.
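The WindowStartEpoch underflow in NEW-A-12 / NEW-A-17 is easy to reproduce in isolation. The sketch below assumes that epochDelta saturates at zero when the earlier epoch is ahead of the current one; that semantics is inferred from the commit message, not copied from the code, and `windowElapsed` / `rawWindowElapsed` are hypothetical helpers.

```go
package main

import "fmt"

// epochDelta returns current - earlier, saturating at 0 instead of wrapping
// (assumed semantics of the helper named in the commit message).
func epochDelta(current, earlier uint64) uint64 {
	if earlier > current {
		return 0
	}
	return current - earlier
}

// windowElapsed is the guarded check: has the scoring window run out?
func windowElapsed(epochID, windowStart, window uint64) bool {
	return epochDelta(epochID, windowStart) >= window
}

// rawWindowElapsed is the unguarded form. When windowStart > epochID the
// uint64 subtraction wraps to a huge value and the window looks expired.
func rawWindowElapsed(epochID, windowStart, window uint64) bool {
	return epochID-windowStart >= window
}

func main() {
	// A window start that is ahead of the current epoch (bad genesis data, say).
	fmt.Println(rawWindowElapsed(5, 7, 10), windowElapsed(5, 7, 10)) // true false
}
```

Rejecting WindowStartEpoch > currentEpoch at genesis validation closes the same hole from the other side, so the saturating helper becomes a second line of defense rather than the only one.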

MEDIUM (9/9):
- NEW-C-2: FinalizeAction audit hook now uses
  CascadeArtifactCountsWithFallback helper (single source of truth
  across Process, GetUpdatedMetadata, FinalizeAction).
- NEW-A-11 residue: bounded epoch-range scans for
  storageTruthReporterDivergenceStats and distinctNodeFailedTickets
  via NodeStorageTruthFailureEpochScanRange and
  ReporterStorageTruthResultEpochScanRange (key shape unchanged).
- NEW-A-13: divergence cross-multiply uses big.Int (overflow-safe).
- NEW-A-14 + NEW-A-15: trust multiplier limited to Class A pre-recheck
  (HASH_MISMATCH OR INDEX-artifact); pattern-escalation bonuses no
  longer scaled (Class B/C natural-fix once predicate narrowed).
- NEW-A-18: per-result PASS/TIMEOUT reporterReliability delta = 0;
  ApplyReporterCleanEpochRecoveryAtEpochEnd applies single -4 once at
  epoch-end on >=5 PASS with no overturned fails (spec §15.3).
- NEW-B-2: healer-eligibility uses decayTowardZero(SuspicionScore, ...)
  matching enforcement.go sibling-symmetry.
- NEW-B-8: finalizeHealOp verified branch resets DistinctHolderFailureCount,
  RecentFailureEpochCount, LastIndexFailureEpoch, LastFailureEpoch to
  restore §20 fresh-start semantic post-heal.
- F121-F12: distinct postpone reason 'audit_storage_truth_strong_suspicion'
  for the strong band + new param StorageTruthStrongRecoveryCleanPassCount
  (default 5) + ap/sts/ strong-marker store key. Recovery selects the
  required pass count based on the persisted reason.
- F121-F10/F119-F3: ticket ContradictionCount bumps now confirmation-guarded
  via contradictionConfirmed bool param (mirrors reporter-side guard).
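The epoch-end recovery rule in NEW-A-18 (a single -4 per reporter when the epoch has at least 5 PASS results and no overturned fails) can be sketched as below. The `epochStats` struct and the map-based state are hypothetical, and any clamping of the resulting score is left out because the commit message does not describe it. Reporters are visited in sorted order since Go map iteration order is randomized and state machines need a fixed order.

```go
package main

import (
	"fmt"
	"sort"
)

// epochStats is a hypothetical per-reporter summary for one epoch.
type epochStats struct {
	Passes          uint32
	OverturnedFails uint32
}

const (
	minCleanPasses = 5         // ">=5 PASS" from the commit message
	recoveryDelta  = int64(-4) // applied once per reporter per epoch
)

// cleanEpochDelta returns the epoch-end delta for one reporter.
func cleanEpochDelta(s epochStats) int64 {
	if s.Passes >= minCleanPasses && s.OverturnedFails == 0 {
		return recoveryDelta
	}
	return 0
}

// applyCleanEpochRecovery applies the delta to every reporter in sorted order.
func applyCleanEpochRecovery(scores map[string]int64, stats map[string]epochStats) {
	reporters := make([]string, 0, len(stats))
	for r := range stats {
		reporters = append(reporters, r)
	}
	sort.Strings(reporters)
	for _, r := range reporters {
		scores[r] += cleanEpochDelta(stats[r])
	}
}

func main() {
	scores := map[string]int64{"a": 10, "b": 10, "c": 10}
	stats := map[string]epochStats{"a": {5, 0}, "b": {9, 1}, "c": {4, 0}}
	applyCleanEpochRecovery(scores, stats)
	fmt.Println(scores["a"], scores["b"], scores["c"]) // 6 10 10
}
```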

LOW (11/11):
- NEW-A-15, NEW-A-17: auto-closed by NEW-A-14 / NEW-A-12 fixes.
- NEW-A-16: median-of-even uses upper-pair (more conservative).
- NEW-B-3: verifierCount promoted to StorageTruthHealVerifierCount param
  (default 2). Allows governance tuning per network conditions.
- NEW-B-4: emit EventTypeHealOpInsufficientVerifiers when verifier pool
  is empty (sibling-symmetry with InsufficientHealers).
- NEW-B-5: linkStorageTruthRecheckTranscript carries doc comment
  clarifying per-creator single-witness uniqueness at link time vs
  cross-creator quorum at scoring time.
- NEW-B-6 + NEW-B-9: InitGenesis cross-validates audit
  StorageTruthPostponements against supernode SuperNodeStatePostponed;
  rejects mismatched state with descriptive error.
- NEW-B-7: GetNextHealOpID panic-guards on malformed state and id==0
  (sibling-symmetry with GetNextEvidenceID).
- NEW-C-4 / NEW-A-19: pruneStorageProofTranscripts logs malformed
  records via k.Logger().Error() so silent corruption is observable.
- F119-F3 residue: cross-holder PASS bonus implemented in
  applyTicketDeteriorationDelta — when PASS lands on a ticket whose
  prior-holder state recorded a failure from a DIFFERENT holder, an
  additional -3 ticket-deterioration delta is applied on top of the
  base bucket reduction. Predicate: result.ResultClass == PASS AND
  prior state.LastTargetSupernodeAccount != result.TargetSupernodeAccount
  AND isStorageTruthFailureClass(state.LastResultClass). Tests cover
  cross-holder (-5 total), same-holder (-2 only), fresh-ticket (no-op),
  and prior-PASS (no bonus) cases.
- NF7: workspace/docs/LEP6.md pair_rank wording updated to canonical
  0x00-framed form (matches in-repo implementation guide).
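For NEW-A-16, one reading of "median-of-even uses upper-pair" is that an even-length sample takes the upper element of the middle pair rather than averaging the two. A minimal sketch under that assumption follows; it works on plain integers for brevity, whereas the PR's medians are taken over rates, and `upperMedian` is a hypothetical name.

```go
package main

import (
	"fmt"
	"sort"
)

// upperMedian returns the middle element for odd lengths and the upper
// element of the middle pair for even lengths. The second result is false
// for an empty input.
func upperMedian(values []uint64) (uint64, bool) {
	if len(values) == 0 {
		return 0, false
	}
	sorted := append([]uint64(nil), values...) // leave the caller's slice untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	return sorted[len(sorted)/2], true
}

func main() {
	m, _ := upperMedian([]uint64{40, 10, 30, 20})
	fmt.Println(m) // 30
}
```

Picking an existing element also avoids a division, which keeps the calculation in exact integer terms.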

Tests:
- Build clean. ./x/... unit green. Module-level simulation green.
  ./tests/integration/... (audit, action, everlight, bank, staking, wasm,
  supernode, gov) green. ./tests/system/... (-tags=system) green.
  Systemtests vet (-tags=system_test) clean. e2e systemtests
  (-tags=system_test, 30min cap) green: 25/25 PASS.
- New unit tests covering: fee-routing, genesis round-trip, window
  safety, expire cooldown, decay-adjusted heal eligibility, panic guards
  on malformed counter state, genesis cross-validation, clean-epoch
  recovery, scoring delta zero (PASS/TIMEOUT per-result), F119-F3
  cross-holder PASS bonus (4 sub-tests).
- Fixture updates per CP-policy with explicit 'Per <CP-id>' citations.

Verification artefacts: /tmp/lep6-r2-fix/.r2/track{1..4}_*.md
Plan: docs/plans/LEP6_REVIEW_R2_FIX_PLAN.md
…resolved (#127)

@mateeullahmalik mateeullahmalik left a comment

Round-2 verification by Zee — 8748065 (commit #127)

I re-checked every R2 finding against the new tip (874806581b77b13fe26ba9fc9f5e8d6887b387bf, commit 8748065 fix: close Zee R2 review). Verification done by reading the actual code at the cited file:line, not by trusting the commit message.

Result

24/24 R2 findings: VERIFIED FIXED. Cleared to merge from my side.

What I confirmed (HIGH-severity items)

| Finding | Verification |
| --- | --- |
| NEW-C-3 restore RegistrationFeeShareBps 2% routing | x/action/v1/keeper/action.go:651-664: block restored verbatim. keeper.go:39,61,90,156: rewardDistributionKeeper field/ctor/getter restored. ✅ |
| NEW-C-1 ExportGenesis epoch-scoped state | genesis.go:210-224 exports RecheckEvidence, StorageProofTranscripts, NodeFailureFacts, ReporterResultFacts, FailedHealMarkers, EpochReports, ReportIndices, HostReportIndices, StorageChallengeIndices (all 9 prefix families). InitGenesis re-emits via existing Set* writers (secondary indexes rebuild naturally via setStorageProofTranscriptRecord). ✅ |
| NEW-A-12 WindowStartEpoch underflow | storage_truth_scoring.go:262, 338 use epochDelta(epochID, state.WindowStartEpoch) >= window. genesis_validate.go:13-15, 26-28 reject WindowStartEpoch > currentEpoch for both score-state slices. ✅ |
| NEW-B-1 EXPIRED heal cooldown | storage_truth_heal_ops.go:43-58: D += 15 (saturated), ProbationUntilEpoch advanced, setStorageTruthFailedHeal written, mirrors FAILED branch exactly. ✅ |
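The "D += 15 (saturated)" step for EXPIRED heal-ops can be sketched as a shared cooldown routine that both the FAILED and EXPIRED branches would call. The field types, the `probationEpochs` argument, and the helper names are assumptions for illustration; the review only states that the score is bumped by 15 with saturation and that ProbationUntilEpoch is advanced.

```go
package main

import (
	"fmt"
	"math"
)

const noShowPenalty int64 = 15 // "DeteriorationScore += 15" from the finding

// ticketState is a hypothetical, reduced form of the ticket record.
type ticketState struct {
	DeteriorationScore  int64
	ProbationUntilEpoch uint64
}

// addSaturated adds a non-negative delta and clamps at MaxInt64 instead of
// wrapping into negative territory.
func addSaturated(score, delta int64) int64 {
	if score > math.MaxInt64-delta {
		return math.MaxInt64
	}
	return score + delta
}

// applyNoShowCooldown is the shared step for FAILED and EXPIRED heal-ops.
// probationEpochs is a hypothetical parameter for how far probation extends.
func applyNoShowCooldown(t *ticketState, epochID, probationEpochs uint64) {
	t.DeteriorationScore = addSaturated(t.DeteriorationScore, noShowPenalty)
	until := uint64(math.MaxUint64)
	if epochID <= math.MaxUint64-probationEpochs {
		until = epochID + probationEpochs
	}
	if until > t.ProbationUntilEpoch { // probation only ever moves forward
		t.ProbationUntilEpoch = until
	}
}

func main() {
	normal := ticketState{DeteriorationScore: 40}
	applyNoShowCooldown(&normal, 100, 20)
	fmt.Println(normal.DeteriorationScore, normal.ProbationUntilEpoch) // 55 120

	nearMax := ticketState{DeteriorationScore: math.MaxInt64 - 3}
	applyNoShowCooldown(&nearMax, 100, 20)
	fmt.Println(nearMax.DeteriorationScore == math.MaxInt64) // true
}
```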

What I confirmed (MEDIUM)

  • NEW-C-2 SSoT helper CascadeArtifactCountsWithFallback in x/action/v1/types/metadata.go, used at all 3 sites (action.go:253, action_cascade.go:151,327).
  • NEW-A-11 residue new NodeStorageTruthFailureEpochScanRange and ReporterStorageTruthResultEpochScanRange (keys.go:558-590) used in both distinctNodeFailedTickets (fact_indexes.go:262) and storageTruthReporterDivergenceStats (divergence.go:221). Bounded [startEpoch, endEpoch+1) scans, key shape unchanged.
  • NEW-A-13 *big.Int cross-multiply for sort + 2x threshold (divergence.go:68-69, 90-97).
  • NEW-A-14 applyTrustScaling gated on isClassA := HASH_MISMATCH || INDEX-artifact (scoring.go:574-579). Pattern-escalation bonuses (nodeBonus/ticketBonus) added unscaled.
  • NEW-A-18 PASS/TIMEOUT per-result reporterReliability = 0; ApplyReporterCleanEpochRecoveryAtEpochEnd (divergence.go:149) emits single −4 once per reporter at epoch-end on >=5 PASSes with no overturned fails. Wired at abci.go:61. Uses bounded ReporterStorageTruthResultEpochScanRange.
  • NEW-B-2 healer eligibility uses decayTowardZero(SuspicionScore, decay, epochDelta(epochID, ss.LastUpdatedEpoch)) (heal_ops.go:117-119) — sibling-symmetric with enforcement.go:119.
  • NEW-B-8 verified-heal branch resets DistinctHolderFailureCount, RecentFailureEpochCount, LastIndexFailureEpoch, LastFailureEpoch (msg_storage_truth.go:405-409).
  • F121-F12 distinct postponeReasonStorageTruthStrong (enforcement.go:18); strong marker ap/sts/<acct> written on entry (enforcement.go:168); recovery selects between StorageTruthRecoveryCleanPassCount and new StorageTruthStrongRecoveryCleanPassCount based on persisted reason (enforcement.go:618-619). New params have defaults (140 / 5) and Validate() enforces Postpone < StrongPostpone (params.go:737).
  • F121-F10 ticket-side ContradictionCount increment now guarded by contradictionConfirmed bool parameter passed into applyTicketDeteriorationDelta (scoring.go:441-446).
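The NEW-A-13 item above relies on comparing ratios without floats. A rough sketch of that cross-multiplication technique follows; the ratio type and the function names are hypothetical and are not taken from divergence.go.

```go
package main

import (
	"fmt"
	"math/big"
)

// ratio is a non-negative fraction num/den; den is assumed to be > 0.
type ratio struct{ num, den uint64 }

// product returns x*y as a big.Int so the multiplication cannot overflow.
func product(x, y uint64) *big.Int {
	return new(big.Int).Mul(new(big.Int).SetUint64(x), new(big.Int).SetUint64(y))
}

// cmpRatio returns -1, 0 or +1 by comparing a.num*b.den with b.num*a.den,
// which orders a and b exactly without any division.
func cmpRatio(a, b ratio) int {
	return product(a.num, b.den).Cmp(product(b.num, a.den))
}

// atLeastDouble reports whether a >= 2*b, again by cross-multiplying.
func atLeastDouble(a, b ratio) bool {
	rhs := product(b.num, a.den)
	rhs.Lsh(rhs, 1) // multiply the right-hand side by 2
	return product(a.num, b.den).Cmp(rhs) >= 0
}

func main() {
	a, b := ratio{2, 3}, ratio{1, 3}
	fmt.Println(cmpRatio(a, b), atLeastDouble(a, b))
}
```

Because both sides are integers of arbitrary precision, every validator computes the same ordering regardless of platform, which is the property the sort and the 2x threshold need.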

LOW (also verified)

  • NEW-A-16 median-of-even now uses upper-pair (more conservative) — divergence.go:74-80, with comment.
  • NEW-B-3 StorageTruthHealVerifierCount param (default 2), used in heal_ops.go:323-328.
  • NEW-B-4 EventTypeHealOpInsufficientVerifiers emitted (heal_ops.go:238; events.go:22).
  • NEW-B-6/B-9 InitGenesis cross-validates each StorageTruthPostponements entry against supernode SuperNodeStatePostponed (genesis.go:103-115). Module init ordering confirmed: app/app_config.go:92 shows supernode → audit → action.
  • NEW-B-7 GetNextHealOpID panics on malformed length and on id==0 (storage_truth_state.go:179-188).
  • NEW-C-4 pruneStorageProofTranscripts now logs malformed records via k.Logger().Error (prune.go:253-262).
  • F119-F3 residue cross-holder PASS bonus (−3 additional ticket-deterioration delta) implemented in scoring path with predicate on prior failure-class + different-holder.
  • NEW-A-15, A-17 auto-closed by NEW-A-14 / NEW-A-12 fixes (verified — pattern bonuses unscaled at scoring.go:83-84; both window checks now use epochDelta).

Determinism scan (consensus paths)

Re-ran on x/audit/v1/keeper/:

  • float|math\.Pow|math\.Float|time\.Now|rand\.|sort\.Float|FormatFloat: 0 hits.
  • range\s+map\[: 0 hits.

Clean.

Verdict

LGTM from R2 perspective. Every blocker, MEDIUM, and LOW from review id 4184561676 is closed at the cited file:line. No new regressions detected on the consensus paths.

Final go/no-go is the human maintainer's call — this review is event=COMMENT per stack-review protocol.

— Zee

mateeullahmalik

This comment was marked as resolved.

Closes all 20 findings from Zee R3 on PR #117 review 4188900358
against post-R2 LEP-6-foundation tip 8748065:

HIGH 3/3:
- scan reporter pass stats across all result classes while honoring overturned failures
- validate strong-recovery clean-pass count against base recovery count
- extend retention/migration history to divergence and heal-deadline windows

MEDIUM 3/3:
- bound heal verifier count to 1..32
- tighten Class-A predicate so TIMEOUT on INDEX remains Class B/unscaled
- use no-op reward distribution mock so action tests exercise fee-routing path

LOW 14/14:
- strict postpone threshold ordering
- recover malformed heal-op counters without panic/reuse
- MaxUint64-safe epoch scan helpers for reporter results and transcripts
- genesis validation/import/export hardening for artifact counts, transcripts, and facts
- explicit failed-heal marker errors and propagation
- clean-epoch recovery for fresh reporters
- cross-holder PASS bonus coverage across non-hash prior failures
- strict artifact-count fallback with error on absent counts
- app-level module-order pinning
- migration-position closure in v1->v2

Implementation guide updated with R3 hardening notes and refreshed pre-release checklist items.

Verification performed before commit:
- go build ./...
- go test ./x/audit/v1/types ./x/audit/v1/keeper ./x/audit/v1/module ./x/action/v1/types ./x/action/v1/keeper ./app
- go test ./x/...
- go test ./tests/integration/action ./tests/integration/audit
- go test -tags=system ./tests/system/...
- go test -tags=system_test -timeout=1800s -v . from tests/systemtests
  ok github.com/LumeraProtocol/lumera/tests/systemtests 1031.464s
- git diff --check

@mateeullahmalik mateeullahmalik left a comment


Round-3 verification by Zee — PR #117 tip ef5991b (commit #128)

I re-checked all 20 R3 findings against the new tip (ef5991ba525ab7b498a9ff4b22d45e0aa9815a90, commit ef5991b fix(lep6): close Zee R3 foundation review findings). Verification by reading actual code at the cited file:line, not the commit message.

Result

20/20 R3 findings VERIFIED FIXED at the right file:line, with regression tests included.

HIGH (3/3 verified)

Finding Verified at
B-F1 §15.3 overturned-fail gate storage_truth_divergence.go:240-268 — single iterator over all classes; OverturnedByRecheck && isStorageTruthFailureClass(class) flips overturn; PASS class counted separately. New regression test storage_truth_overturn_gate_test.go (106 lines) asserts 5 PASS + 1 overturned HASH_MISMATCH does NOT receive −4. ✅
C-F1 StrongRecoveryCleanPassCount Validate gap params.go:781-787: > 0 check + >= RecoveryCleanPassCount ordering invariant. New params_r3_validate_test.go (95 lines) covers both. ✅
C-F3 KeepLastEpochEntries lookback params.go::Validate() requiredHistory enumeration extended with StorageTruthDivergenceWindowEpochs and StorageTruthHealDeadlineEpochs. Migration handler module/migrations.go::NewMigrateV1ToV2 was extended in-place (v2 hasn't shipped to mainnet) to bump KeepLastEpochEntries to cover both windows; new migrations_test.go. ✅

MEDIUM (3/3 verified)

Finding Verified at
B-F2 TIMEOUT-on-INDEX Class-A misclassification Both call sites fixed: storage_truth_scoring.go:277-278 (history-fields) drops ArtifactClass==INDEX disjunct, isClassA only on HASH_MISMATCH || RECHECK_CONFIRMED_FAIL. :578 (trust-scaling predicate) similarly narrowed. Comment cites CP-R3 B-F2 explicitly. ✅
C-F2 HealVerifierCount upper bound params.go:791-793: > 0 && <= 32. ✅
C-F4 testutil rewardDist mock testutil/keeper/action.go:368: &MockRewardDistributionKeeper{Bps: 0} replaces the nil short-circuit, so all pre-existing action+audit tests now exercise the routing branch as no-op. ✅

LOW (14/14 verified)

  • A-F1 GetNextHealOpID graceful recovery (storage_truth_state.go:172-208) — three-way fallback (nil / wrong-len / id==0 → deriveNextHealOpID scans st/ho/ for max). Sibling-symmetric with GetNextEvidenceID. ✅
  • A-F2 / B-F3 ad-hoc range builders → named ScanRange helpers (keys.go:672-722) ReporterStorageTruthResultByTargetEpochScanRange and TranscriptByTargetBucketEpochScanRange, both ^uint64(0)-safe via prefixEnd(base) fallback. ✅
  • A-F3 ExportGenesis NextHealOpId == 0 branch is now reachable (consistent with the graceful A-F1 fix). ✅
  • A-F4 ValidateScoreStatesGenesis extended with WindowStartEpoch > currentEpoch checks AND a new TicketArtifactCountStates block (genesis_validate.go:36-44) rejecting empty ticket-id and (0,0) counts. ✅
  • A-F5 importStorageProofTranscriptForGenesis uses json.Decoder.DisallowUnknownFields() (storage_truth_fact_indexes.go:404-410). ✅
  • B-F4 setStorageTruthFailedHeal returns explicit error on empty supernode/ticket; both call sites (storage_truth_heal_ops.go:56-58 and msg_storage_truth.go:418-420) propagate. ✅ Subtle note below.
  • B-F5 ApplyReporterCleanEpochRecoveryAtEpochEnd now unions GetAllReporterReliabilityStates with the new storageTruthReporterAccountsForEpoch helper (divergence.go:163-178), then sorts the union for determinism — fresh reporters are no longer skipped. ✅
  • B-F6 storage_truth_cross_holder_pass_test.go:113-115 parameterises prior-class over TIMEOUT / OBSERVER_QUORUM_FAIL / INVALID_TRANSCRIPT. ✅
  • C-F5 Postpone < StrongPostpone strict (params.go:747-754, >= rejected). ✅
  • C-F6 CascadeArtifactCountsWithFallbackStrict returns explicit error when both counts and RqIdsIds are zero (x/action/v1/types/metadata.go:22-39); all 3 consensus-path callers in action.go:253 and action_cascade.go:150,329 use the strict variant. Non-strict variant preserved for backward compat. ✅
  • C-F7 new pin test app/lep6_module_order_test.go asserts supernode→audit→action across genesisModuleOrder, beginBlockers, endBlockers. ✅
  • C-F9 Migration handler extended in-place (module/migrations.go) since v2 has not shipped to mainnet — KeepLastEpochEntries = max(OldClassA, Divergence, HealDeadline). ✅
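The A-F2 / B-F3 item describes epoch scan ranges that stay correct at MaxUint64 by falling back to prefixEnd(base). A sketch of that idea is given here; prefixEnd is named in the review, but its body, epochKey, and epochScanRange are illustrative assumptions rather than the helpers in keys.go.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// prefixEnd returns the smallest key that sorts after every key starting
// with prefix, or nil when no such key exists (prefix is all 0xff bytes).
func prefixEnd(prefix []byte) []byte {
	end := append([]byte(nil), prefix...)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] != 0xff {
			end[i]++
			return end[:i+1]
		}
	}
	return nil
}

// epochKey appends the big-endian epoch to base (hypothetical key layout).
func epochKey(base []byte, epoch uint64) []byte {
	key := make([]byte, len(base)+8)
	copy(key, base)
	binary.BigEndian.PutUint64(key[len(base):], epoch)
	return key
}

// epochScanRange returns half-open bounds [start, end) that cover epochs
// startEpoch..endEpoch inclusive under base.
func epochScanRange(base []byte, startEpoch, endEpoch uint64) (start, end []byte) {
	start = epochKey(base, startEpoch)
	if endEpoch == ^uint64(0) {
		// endEpoch+1 would wrap to 0, so bound the scan by the prefix instead.
		return start, prefixEnd(base)
	}
	return start, epochKey(base, endEpoch+1)
}

func main() {
	start, end := epochScanRange([]byte("p/"), 1, ^uint64(0))
	fmt.Printf("%x %x\n", start, end)
}
```

Without the MaxUint64 branch, the exclusive end key would encode epoch 0 and the range would be empty or inverted.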

Determinism scan (consensus paths)

x/audit/v1/keeper/: 0 hits for float|math.Pow|math.Float|time.Now|rand.|sort.Float|FormatFloat; 0 hits for range\s+map\[. The two new map iterations introduced by R3 fixes (divergence.go:174 and :231 over reporterSet) are both followed by sort.Strings(reporters) — deterministic. ✅
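The "range over a map, then sort" pattern mentioned above can be sketched as follows. The reporterSet name mirrors the review; the sortedKeys helper is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys collects the keys of a set and sorts them, so that callers
// never observe Go's randomized map iteration order.
func sortedKeys(set map[string]struct{}) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	reporterSet := map[string]struct{}{"lumera1b": {}, "lumera1a": {}, "lumera1c": {}}
	for _, reporter := range sortedKeys(reporterSet) {
		fmt.Println(reporter)
	}
}
```

The map iteration itself is still unordered; determinism comes from the fact that nothing consumes the keys until after the sort.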

One subtle note (B-F4 strict-error propagation)

The B-F4 fix changes setStorageTruthFailedHeal from a silent no-op to a hard error on an empty supernodeAccount. Both call sites (expireStorageTruthHealOpsAtEpochEnd and finalizeHealOp) now propagate the error, which means EndBlock returns an error if any non-final HealOp has an empty HealerSupernodeAccount. The runtime scheduler at storage_truth_heal_ops.go:247-263 guarantees a non-empty healer (it hits continue when assignment fails, before constructing the HealOp), so this is a safe state-machine invariant at runtime. However, ImportGenesis at genesis.go:88 re-emits genState.HealOps via SetHealOp without validating HealerSupernodeAccount != "", so a malformed genesis with an empty-healer non-final HealOp would halt the chain at the first epoch end. This is governance-controlled (genesis only; no runtime path can produce the malformed state), but it is worth a defense-in-depth tweak: add a HealerSupernodeAccount != "" check for non-final-status HealOps inside the heal-op import loop in genesis.go:88-95. Not blocking; flagging for a follow-up.
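The suggested follow-up could look roughly like the sketch below. The HealOp struct, the status constants, and the choice of which statuses count as final are all assumptions made for illustration; only the HealerSupernodeAccount field name comes from the review.

```go
package main

import "fmt"

// HealOpStatus is a hypothetical stand-in for the module's status enum.
type HealOpStatus int

const (
	StatusScheduled HealOpStatus = iota
	StatusInProgress
	StatusVerified
	StatusFailed
	StatusExpired
)

// HealOp carries only the fields this check needs.
type HealOp struct {
	ID                     uint64
	Status                 HealOpStatus
	HealerSupernodeAccount string
}

// isFinal assumes VERIFIED, FAILED and EXPIRED are the terminal statuses.
func isFinal(s HealOpStatus) bool {
	return s == StatusVerified || s == StatusFailed || s == StatusExpired
}

// validateHealOpsForImport rejects a non-final heal-op with no healer, so a
// malformed genesis fails at import time instead of at the first epoch end.
func validateHealOpsForImport(ops []HealOp) error {
	for _, op := range ops {
		if !isFinal(op.Status) && op.HealerSupernodeAccount == "" {
			return fmt.Errorf("heal op %d: non-final status with empty healer account", op.ID)
		}
	}
	return nil
}

func main() {
	ops := []HealOp{{ID: 7, Status: StatusInProgress}}
	fmt.Println(validateHealOpsForImport(ops))
}
```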

Tests run

The commit message claims:

  • go build ./... ✅ (assumed; not re-run here)
  • go test ./x/audit/v1/{types,keeper,module} ./x/action/v1/{types,keeper} ./app
  • go test ./x/...
  • go test ./tests/integration/{action,audit}
  • go test -tags=system ./tests/system/...
  • go test -tags=system_test -timeout=1800s -v . from tests/systemtests — passed in 1031s ✅

Verdict

LGTM from R3 perspective. Cleared for merge from my side. All 20 findings closed at the right file:line with corresponding regression tests, no determinism regressions, no new HIGH/MEDIUM issues introduced. The single LOW follow-up (genesis-import healer-account validation) is a defense-in-depth improvement, not a blocker.

Final go/no-go is the human maintainer's call.

— Zee

@j-rafique j-rafique requested a review from a-ok123 April 29, 2026 16:23

@mateeullahmalik mateeullahmalik left a comment


FINAL Production-Gate Review by Zee — PR #117 tip ef5991b

Tip reviewed: ef5991ba525ab7b498a9ff4b22d45e0aa9815a90
Mandate: Final pre-merge gate. User wants explicit assurance no consensus problem / chain halt / determinism issue.

Methodology

Three parallel tracks, each focused exclusively on consensus-safety:

  • Track A: Determinism (float, time, rand, map iteration), chain-halt panic class, KV-iteration ordering, divisor/modulus on data, slice-bound safety. Full grep-based audit + line-by-line read of every consensus-path file in x/audit/v1/{keeper,types}.
  • Track B: Genesis InitGenesis/ExportGenesis round-trip determinism, migration safety, Params.Validate() completeness, MsgUpdateParams malleability surface.
  • Track C: Cross-module integration (action↔audit↔supernode), ABCI ordering, DistributeFees money flow, FinalizeAction→audit hook safety, IBC surface, OOG bounds.

R3 closure (review id 4191762658) was already verified in commit ef5991b. This round did not re-validate R3 — it hunted what we and the author missed.


Verdict

🟡 SAFE TO MERGE WITH PRE-MAINNET FOLLOW-UPS

No CRITICAL / chain-halt / in-block determinism issues found. The PR is consensus-safe for live chain operation.

However, 5 HIGH issues are open that you should track before any production-restart workflow or before significant cascade volume materializes. None blocks merge of the PR; all are safe to fix in follow-up PRs as long as the team is aware before mainnet.


What's verified clean ✅ (the consensus-critical surface)

| Class | Result |
| --- | --- |
| Float / Pow / Sqrt / FormatFloat / sort.Float | 0 hits in consensus paths. The 5 float64 hits in enforcement.go::violatesMinFree/compliesMinFree and state.go:52 operate on HostReport.{Cpu,Mem,Disk}UsagePercent proto fields decoded via math.Float64frombits (bit-exact), used only for <, ==, <= and one 100.0 - x subtraction. Deterministic per IEEE-754 + Go spec. No sort.Float, no float-keyed map, no FormatFloat. ✅ |
| time.Now / rand / crypto/rand | 0 hits in scope. ✅ |
| Map iteration (for x := range map[]) | All 5 hits verified order-safe: either sorted before consumption (storage_truth_divergence.go:174,231: sort.Strings), used as a min-by-deterministic-rank with lex tiebreak (audit_peer_assignment.go:156), or consumed only as membership test in error-only paths (msg_submit_epoch_report_storage_proofs.go:359,389). ✅ |
| panic / Must… in consensus paths | Only cdc.MustUnmarshal against bytes just written via cdc.Marshal — round-trip safe by construction. No MustNewDecFromStr, no raw panic(, no MustParse. ✅ |
| binary.BigEndian.Uint64(bz) length checks | All 13 call sites verified with explicit len(bz) != 8 or len(key) >= len(prefix)+8 guards. ✅ |
| Division/modulo with data-driven divisor | All 11 call sites have either explicit if x == 0 return guards or are reached only after Params.Validate() enforces > 0 on the param. decayTowardZero denominator is hardcoded 1000; factor bounded [1,1000] by Validate. scaleInt64TowardZero uses *big.Int to avoid overflow + early-returns on denominator <= 0. ✅ |
| Slice indexing from data | Every body[split+1:], key[len(prefix)+8:], obs.PortStates[i] is preceded by an explicit length check. ✅ |
| EndBlock ordering | app/app_config.go:156-176 gives supernode → audit → action across genesisModuleOrder, beginBlockers, endBlockers. Pinned by new app/lep6_module_order_test.go. Audit reads supernode active set after supernode EndBlock has settled. Action's DistributeFees runs only on tx path, not EndBlock. No circular hook dependency. ✅ |
| DistributeFees money flow | All Sub operations are bounded by their preceding Mul (bps ≤ 10000 enforced). No underflow. The ordering reward → foundation → per-SN is sequential against residual fee, not original — composition stays consistent even when shares conceptually exceed 100%. ✅ |
| SetStorageTruthTicketArtifactCounts | idempotent re-write returns nil; mismatch returns error. Caller FinalizeAction reverts atomically on error. No half-written state. ✅ |
| storageTruthAssignmentHash (audit_peer_assignment.go:232) | Single SHA-256 over `seed |
| Eligible challenger / healer / verifier pools | All read from anchor.ActiveSupernodeAccounts (frozen at epoch start, sorted at write time) or sort.Strings(supernodeKeeper.GetAllSuperNodes(...)). Selection comparators use bytes.Compare on SHA-256 outputs with lex-tiebreak on bech32 strings (ASCII = byte compare). Empty pool returns empty slice, no panic. ✅ |
| OOG per-tx | MsgSubmitEpochReport.StorageProofResults capped at 16 (MaxStorageProofResultsPerReport validated in handler). Per-result inner loops are O(results × constant). Recheck-evidence handles one record per Msg. ✅ |
| IBC surface | No new packet types, ack handlers, or channel openings. No proto/ibc/* changes. New audit state lives under audit module's own KV prefixes; no IBC commitment-path implications. ✅ |
| R3 fixes (20 of them) | All confirmed still in place at this tip. Determinism scan still 0 hits. ✅ |
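The DistributeFees row above says each share is taken sequentially from the residual fee rather than from the original amount. A simplified sketch of that composition is shown here, assuming both shares are expressed in basis points; the real foundation share may use a different type, and shareOf and splitFee are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

const bpsDenominator = 10000

// shareOf returns floor(amount*bps/10000) without overflowing uint64 by
// splitting amount into its quotient and remainder modulo 10000.
func shareOf(amount, bps uint64) uint64 {
	return amount/bpsDenominator*bps + amount%bpsDenominator*bps/bpsDenominator
}

// splitFee carves the reward share out of the fee first, then the foundation
// share out of what is left; supernodes receive the final residual.
func splitFee(fee, rewardBps, foundationBps uint64) (reward, foundation, supernodes uint64, err error) {
	if rewardBps > bpsDenominator || foundationBps > bpsDenominator {
		return 0, 0, 0, errors.New("share exceeds 10000 bps")
	}
	reward = shareOf(fee, rewardBps)
	residual := fee - reward // cannot underflow: reward <= fee
	foundation = shareOf(residual, foundationBps)
	supernodes = residual - foundation // cannot underflow: foundation <= residual
	return reward, foundation, supernodes, nil
}

func main() {
	reward, foundation, supernodes, err := splitFee(1000, 200, 1000)
	fmt.Println(reward, foundation, supernodes, err)
}
```

Because every subtraction removes a share of the amount it is subtracted from, the three parts always sum to the original fee, even if both shares are set to 100%.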

HIGH severity — must address before mainnet (NOT blocking this merge)

F-B1 (HIGH, MIGRATION/DETERMINISM): Strong-postpone band marker ap/sts/ is silently dropped on lumerad export → init

  • File:Line: proto/lumera/audit/v1/genesis.proto::StorageTruthPostponement has only {supernode_account, postponed_at_epoch_id} — no strong flag. x/audit/v1/keeper/genesis.go:103-115, 211 round-trips only those two fields.
  • Reproduction: chain runs, supernode crosses StorageTruthNodeSuspicionThresholdStrongPostpone (140), enforcement.go:168 writes ap/sts/<acct>=1. Operator runs lumerad export. New chain bootstraps from this genesis. hasStorageTruthStrongPostponeMarker returns false → recovery now requires RecoveryCleanPassCount (default 3) instead of StrongRecoveryCleanPassCount (default 5). A strong-postponed node therefore recovers after 3 clean passes instead of the 5 the spec requires (40% fewer).
  • Why not consensus-halt today: every validator restarting from the same exported genesis loses the same data, so the resulting chain is internally consistent — just semantically wrong vs. source chain. Live in-place upgrades (KV intact) are unaffected.
  • Fix: add bool strong_postpone to StorageTruthPostponement proto; export by reading ap/sts/ companion key; import via setStorageTruthStrongPostponeMarker.

F-B2 (HIGH, MIGRATION/DETERMINISM): Action-finalization postponement marker ap/af/ not in genesis at all

  • File:Line: x/audit/v1/keeper/action_finalization_postponement_state.go:10-30 — set/get/clear. No GetAllActionFinalizationPostponements exists; no field on GenesisState; no import loop in genesis.go. Consumer enforcement.go:251 (recovery routing).
  • Reproduction: AF-postponed supernode → export → import → marker absent → supernode evaluated for recovery via host/peer-port path, bypassing ActionFinalizationRecoveryEpochs and ActionFinalizationRecoveryMaxTotalBadEvidences. Recovery rules silently change.
  • Fix: add repeated ActionFinalizationPostponement to GenesisState; mirror the storage-truth pattern.

F-B3 (HIGH, MIGRATION/DETERMINISM): Evidence per-epoch counters eve/ not exported/imported

  • File:Line: x/audit/v1/keeper/evidence_epoch_count.go:10-26. Writer incrementEvidenceEpochCount is called only from evidence.go:87 (live). Consumers enforcement.go:356, 388-389 drive AF postponement and recovery. Zero Genesis I/O for the eve/ prefix.
  • Reproduction: existing chain has accumulated eve/<epoch>/<subject>/<type> counters used by shouldPostponeForActionFinalizationEvidence. Export → import → all counters drop to 0. Postponement gating becomes a no-op until counters re-accumulate; suspect supernodes recover prematurely.
  • Fix: either (a) reconstruct counters by walking imported Evidence slice in InitGenesis, or (b) round-trip via explicit repeated GenesisEvidenceEpochCount field.
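Option (a) of the F-B3 fix could be approached along these lines. This is only a sketch: the Evidence struct, the countKey layout, and the in-memory map are assumptions standing in for the real Evidence type and the eve/ KV prefix.

```go
package main

import "fmt"

// Evidence holds the three fields the eve/<epoch>/<subject>/<type> key is
// described as using; the field names are hypothetical.
type Evidence struct {
	EpochID uint64
	Subject string
	Type    string
}

// countKey mirrors the assumed counter key layout.
type countKey struct {
	EpochID uint64
	Subject string
	Type    string
}

// rebuildEvidenceEpochCounts walks imported evidence once and recomputes
// the per-epoch counters instead of round-tripping them through genesis.
func rebuildEvidenceEpochCounts(evidence []Evidence) map[countKey]uint64 {
	counts := make(map[countKey]uint64)
	for _, ev := range evidence {
		counts[countKey{ev.EpochID, ev.Subject, ev.Type}]++
	}
	return counts
}

func main() {
	evidence := []Evidence{
		{EpochID: 4, Subject: "lumera1sn", Type: "ACTION_FINALIZATION"},
		{EpochID: 4, Subject: "lumera1sn", Type: "ACTION_FINALIZATION"},
		{EpochID: 5, Subject: "lumera1sn", Type: "ACTION_FINALIZATION"},
	}
	counts := rebuildEvidenceEpochCounts(evidence)
	fmt.Println(counts[countKey{4, "lumera1sn", "ACTION_FINALIZATION"}])
}
```

In a keeper, the increments would be written straight to the store while iterating the evidence slice in its exported order, so no map iteration is involved. This approach only reproduces the original counters if every evidence record that was counted is still present in the exported genesis.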

F-B4 (HIGH, MIGRATION/DETERMINISM): HealOp verifications st/hov/ not exported/imported

  • File:Line: x/audit/v1/keeper/storage_truth_state.go:265-305 (set/get/getAll). genesis.HealOps round-trips the heal-op record itself but the verification sub-keys are dropped.
  • Reproduction: HealOps in IN_PROGRESS / VERIFYING with partial verifications. Export → import → all per-verifier votes vanish; quorum count restarts at 0. Heal-ops one verification away from VERIFIED reset to needing the full StorageTruthHealVerifierCount from scratch.
  • Fix: add repeated GenesisHealOpVerification {uint64 heal_op_id; string verifier; bool verified;} to GenesisState; export via GetAllHealOpVerifications per non-final heal-op.

F-C1 (HIGH, OOG): TicketDeteriorationState is iterated every epoch-end but never pruned — slow-burn EndBlock OOG halt risk

  • File:Line: x/audit/v1/keeper/storage_truth_heal_ops.go:105: ticketStates, _ := k.GetAllTicketDeteriorationStates(ctx) runs unconditionally at every epoch-end. grep -rn "TicketDeteriorationState" x/audit/v1/keeper/ | grep Delete returns zero delete sites. prune.go::PruneOldEpochs only handles epoch-leading prefixes; st/td/ is keyed by ticket_id, not epoch. TicketArtifactCountState (st/tac/) has the same shape.
  • Reproduction: every cascade ticket experiencing a storage-proof event creates a TicketDeteriorationState row that is never deleted. Over chain lifetime, every epoch-end EndBlocker iterates and proto-unmarshals N rows, sorts, filters. An attacker can amplify by spamming cheap cascade actions (or naturally accumulates with mainnet usage). When per-block work exceeds consensus timeout or hits OOM on smaller validators, validators fork on liveness.
  • Why not caught by Validate / ante: there is no per-epoch cap on iteration. StorageTruthMaxSelfHealOpsPerEpoch only caps how many heal-ops are scheduled, not how many ticket states are scanned. The full prefix scan runs unconditionally before the cap is applied.
  • Why not consensus-halt today: cascade volume on devnet/testnet is low. Becomes a real concern at mainnet scale or under deliberate amplification.
  • Fix: prune st/td/ rows where LastUpdatedEpoch + KeepLastEpochEntries < currentEpoch && DeteriorationScore == 0 && ActiveHealOpId == 0 from PruneOldEpochs. Same treatment for st/tac/ once finalize-state is settled. Alternative: maintain a (score_bucket || ticket_id) secondary index and only iterate the high-score band.
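The pruning predicate proposed in the F-C1 fix can be written without risking uint64 overflow in LastUpdatedEpoch + KeepLastEpochEntries. The sketch below assumes a trimmed-down TicketDeteriorationState struct and a hypothetical isPrunable helper.

```go
package main

import "fmt"

// TicketDeteriorationState carries only the fields the predicate reads;
// the real proto message has more.
type TicketDeteriorationState struct {
	TicketID           string
	DeteriorationScore int64
	LastUpdatedEpoch   uint64
	ActiveHealOpID     uint64
}

// isPrunable implements
//   LastUpdatedEpoch + keepLast < currentEpoch && score == 0 && no active heal-op
// rearranged as LastUpdatedEpoch < currentEpoch - keepLast so the addition
// cannot wrap around.
func isPrunable(s TicketDeteriorationState, currentEpoch, keepLast uint64) bool {
	if currentEpoch < keepLast {
		return false // the chain is younger than the retention window
	}
	return s.LastUpdatedEpoch < currentEpoch-keepLast &&
		s.DeteriorationScore == 0 &&
		s.ActiveHealOpID == 0
}

func main() {
	stale := TicketDeteriorationState{TicketID: "t1", LastUpdatedEpoch: 10}
	fmt.Println(isPrunable(stale, 100, 50))
}
```

Rows with a non-zero score or an active heal-op are kept regardless of age, so pruning never discards state that scheduling still depends on.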

MEDIUM severity (track but not blocking)

F-B5 (MEDIUM, MIGRATION/MALLEABILITY): NewMigrateV1ToV2 writes params back without re-running Validate()

  • File:Line: x/audit/v1/module/migrations.go:14-41; keeper.SetParams does NOT call Validate.
  • Reproduction: a v1 chain with Postpone=200, StrongPostpone=0 → migration runs WithDefaults (StrongPostpone=140) → SetParams writes the now-invalid Postpone(200) >= StrongPostpone(140) combo. Runtime never re-validates, the strong band collapses.
  • Fix: call params.Validate() at the end of the migration handler (and ideally inside keeper.SetParams).
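To illustrate the F-B5 fix, here is a toy model in which SetParams refuses invalid params, so the migration path inherits the check. Params, Keeper, and migrateV1ToV2 are simplified stand-ins; only the Postpone < StrongPostpone invariant and the 140 default come from the review.

```go
package main

import (
	"errors"
	"fmt"
)

// Params models just the two thresholds involved in the reproduction.
type Params struct {
	Postpone       int64
	StrongPostpone int64
}

// WithDefaults fills the strong threshold when it is unset.
func (p Params) WithDefaults() Params {
	if p.StrongPostpone == 0 {
		p.StrongPostpone = 140
	}
	return p
}

// Validate enforces the strict ordering Postpone < StrongPostpone.
func (p Params) Validate() error {
	if p.Postpone >= p.StrongPostpone {
		return errors.New("postpone threshold must be below strong-postpone threshold")
	}
	return nil
}

// Keeper is a stand-in that stores params in memory.
type Keeper struct{ params Params }

// SetParams validates before writing, so no caller can persist a bad combo.
func (k *Keeper) SetParams(p Params) error {
	if err := p.Validate(); err != nil {
		return err
	}
	k.params = p
	return nil
}

// migrateV1ToV2 applies defaults and relies on SetParams for validation.
func migrateV1ToV2(k *Keeper, old Params) error {
	return k.SetParams(old.WithDefaults())
}

func main() {
	k := &Keeper{}
	fmt.Println(migrateV1ToV2(k, Params{Postpone: 200}))
}
```

In this model the Postpone=200, StrongPostpone=0 case from the reproduction surfaces as a migration error rather than a silently collapsed strong band. Whether a failed upgrade is preferable to clamping the value is a separate design decision for the maintainers.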

LOW severity (informational / defense-in-depth)

| # | Class | Headline |
| --- | --- | --- |
| F-A1 | DETERMINISM (theoretical) | decayTowardZero int64 multiplication wraps if score ever reached MaxInt64 (deltas ≤ 26 per result + saturation makes this unreachable in practice) |
| F-A2 | DETERMINISM (informational) | float64 HostReport usage fields — pure comparisons, bit-exact, deterministic, but flagged for policy. Could be migrated to uint32 |
| F-B6 | MALLEABILITY | MaxProbeTargetsPerEpoch == 0 not rejected by Validate (relies on WithDefaults); JSON genesis with explicit zero would silently disable probe coverage |
| F-B7 | MALLEABILITY | KeepLastEpochEntries and window params have no upper bound; MaxUint64 accepted — pruning becomes a no-op forever |
| F-B8 | MALLEABILITY | ScChallengersPerEpoch not bounded; sentinel-zero overload undocumented |
| F-B9 | INFO | ConsensusVersion=2 + RegisterMigration(1, ...) + app/upgrades/v1_11_1 audit pinning verified consistent. v1.12.0 doesn't touch audit |
| F-B10 | MIGRATION | importStorageProofTranscriptForGenesis uses DisallowUnknownFields — strict but means cross-version genesis imports break. Other fact-index imports use raw-bytes pattern; recommend mirroring |
| F-B11/12 | INFO | No duplicate-ID assertions for EvidenceId / HealOpId in genesis validator |
| F-C2 | MALLEABILITY | RegistrationFeeShareBps + FoundationFeeShare > 100% not validated cross-module — sequential subtraction stays consistent, no underflow, cosmetic only |
| F-C3 | MALLEABILITY | validateDecString accepts negative LegacyDec → governance can set (superNode=2.0, foundation=-1.0) → sdk.NewCoin panics at FinalizeAction → per-tx revert (NOT halt). Add non-negative guard |
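For F-A1, one way to rule out the int64 wrap entirely is to do the multiplication in big.Int and saturate on the way back. The sketch below is an assumption about how such a helper might look; it is not the body of decayTowardZero or scaleInt64TowardZero, and returning the value unchanged on a non-positive denominator is a guess at the early-return behaviour.

```go
package main

import (
	"fmt"
	"math"
	"math/big"
)

// scaleTowardZero returns value*numerator/denominator, truncated toward
// zero, computing the product in big.Int so it cannot wrap.
func scaleTowardZero(value, numerator, denominator int64) int64 {
	if denominator <= 0 {
		return value // assumed early return; leaves the score untouched
	}
	scaled := new(big.Int).Mul(big.NewInt(value), big.NewInt(numerator))
	scaled.Quo(scaled, big.NewInt(denominator)) // Quo truncates toward zero
	if !scaled.IsInt64() {
		// Saturate instead of wrapping if the result leaves the int64 range.
		if scaled.Sign() < 0 {
			return math.MinInt64
		}
		return math.MaxInt64
	}
	return scaled.Int64()
}

func main() {
	fmt.Println(scaleTowardZero(100, 900, 1000))
	fmt.Println(scaleTowardZero(math.MaxInt64, 900, 1000))
}
```

With a factor bounded to [1,1000] over a denominator of 1000, the result never exceeds the input in magnitude, so the saturation branch only matters if the factor bound is ever relaxed.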

Pre-mainnet checklist

  1. Address F-C1 first (state-bloat OOG) — schedule pruning for st/td/ and st/tac/ BEFORE mainnet sees significant cascade volume. This is the only HIGH that can manifest on a live chain over time.
  2. Address F-B1 through F-B4 before any export-import workflow is exercised in production — testnet hand-offs, disaster-recovery genesis-export, fork-from-state operations all hit these paths. They're mechanically straightforward (proto field add + GetAll/Set helper + InitGenesis loop).
  3. Add Validate() to keeper.SetParams (closes F-B5).
  4. Tighten Validate for F-B6/B7/B8/C3 — cheap defensive fixes against future gov mistakes.

Summary

| Track | CRITICAL | HIGH | MEDIUM | LOW |
| --- | --- | --- | --- | --- |
| A — Determinism / chain-halt / KV iteration | 0 | 0 | 0 | 2 |
| B — Genesis / migration / param malleability | 0 | 4 | 1 | 7 |
| C — Cross-module / EndBlock / DistributeFees | 0 | 1 | 0 | 2 |
| Total | 0 | 5 | 1 | 11 |

No consensus-affecting issues at runtime. No deterministic chain-halt at runtime. No determinism issues at runtime. The 5 HIGH findings are all latent issues that materialize either (a) on lumerad export | init-from-genesis workflows (F-B1..B4) or (b) under sustained cascade volume / chain age (F-C1). Address them before mainnet but they do not block this merge.

Posted as event=COMMENT. Final go/no-go is the human maintainer's call.

— Zee

Accept valid partial storage challenge observations instead of rejecting the full epoch report, then apply a once-per-epoch reporter reliability penalty with reason INCOMPLETE_REPORT. Keep invalid observations rejected and avoid immediate postponement.
@mateeullahmalik mateeullahmalik merged commit 451f8a8 into master Apr 30, 2026
15 checks passed
akobrin1 added a commit that referenced this pull request May 1, 2026
Brings master's LEP-6 audit storage-truth scaffolding (#117), recent devnet
test fixes (#123, #124), and scoped lint cleanup (#125) onto the evm branch.

Conflict resolutions:
- .gitignore: union of both sides, deduped, fixed .codex/.agents (file vs dir)
- devnet/go.mod: master's lumera-only replace, kept evm's local sdk-go replace
  (devnet evmigration tests use BuildAndSignTxWithOptions only on local sdk-go);
  cosmos-sdk pinned to evm's v0.53.6.
- supernode-setup.sh: folded master's EVERLIGHT_TEST_TARGET host_reporter
  bypass into evm's start_supernode_process abstraction so all call sites
  honor the env var.
- audit_peer_ports / audit_recovery system tests: kept evm's wider epoch
  windows + multi-node assignment-aware reporting (the EVM-integrated binary
  is slower; tighter epochs flake).
- distribution_freshness / query_get_reward_eligibility tests: kept evm's
  Bech32-prefixed config constants (master's lcfg.ValidatorAddressPrefix
  doesn't compile on evm).
- metrics_state.go / metrics_validation.go: kept evm's reserved
  STORAGE_FULL helpers (markStorageFull, recoverFromStorageFull,
  lastNonDegradedState) for follow-up audit-enforcement wiring.
- testutil/cryptotestutils → testutil/crypto: rewrote 3 master-side import
  paths to evm's renamed directory.
- Dropped master's duplicate indexOfModule from lep6_module_order_test.go
  in favor of the existing helper in app/evm_test.go.

Lint cleanup applied to bring `make lint` to 0 issues:
- defer it.Close() → defer func() { _ = it.Close() }() across audit module.
- gofmt -w on master-side test/sim files.
- Removed unused //nolint:staticcheck directives on HasInvariants.
- Marked reserved declarations (maxAuditEpochLookback, globalAuditKeeper)
  with //nolint:unused.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>