Skip to content

test(integration): LCM+SM live-fire against real SQLite store#49

Merged
Pritom14 merged 4 commits into
mainfrom
session/aa-31
May 31, 2026
Merged

test(integration): LCM+SM live-fire against real SQLite store#49
Pritom14 merged 4 commits into
mainfrom
session/aa-31

Conversation

@harshitsinghbhandari
Copy link
Copy Markdown
Collaborator

Summary

Live-fire integration verification of the lifecycle + session lane against the SQLite store + CDC pipeline merged in PR #37. Until now every LCM/SM test ran against the in-memory fake in lifecycle/fakes_test.go — this PR adds a separate integration package that hydrates the real components in one process and asserts via direct store reads.

  • backend/internal/integration/lifecycle_sqlite_test.go — five end-to-end tests.
  • backend/internal/storage/sqlite/db.go — one minimal fix for a real wiring gap surfaced by -race.

All 219 tests pass under -race across 18 packages. The fakes-based unit tests are untouched.

What the tests cover

  • TestHappyPath_Spawn_PR_KillSM.Spawn → SCM PR observation (open + CI passing) → SM.Kill. Asserts the canonical session row (state/is_alive/termination_reason), the pr table row written atomically in one tx, and that the change_log captured session_created, session_updated, pr_created, pr_check_recorded event_types.
  • TestRestoreRoundTrip_PreservesMetadata — spawn, kill, close store, reopen the same DB path, hydrate a fresh LCM/SM, SM.Restore. Asserts Metadata.AgentSessionID and the rest of SessionMetadata survive a full daemon restart and flow back through OnSpawnCompleted.
  • TestCIFailureAndRecovery_NudgeThenClears — failing CI observation fires rxCIFailed through the messenger with the log tail injected; passing CI observation switches the human notifier to reaction.approved-and-green. Asserts pr_checks history is the brake's source of truth.
  • TestDetectingPersistsAcrossRestart — failed probe parks the session in detecting; detecting_attempts / detecting_started_at / detecting_evidence_hash round-trip across a close/reopen; alive probe clears the quarantine.
  • TestCDCPollerReceivesAllStages — drives cdc.Poller over the real change_log; asserts each expected event_type is delivered and seq is monotonic across the stream.

Wiring gap fixed (minimal, called out)

backend/internal/storage/sqlite/db.go (~11 lines) — goose v3 keeps baseFS, logger, and dialect as package-level globals. migrate() mutates them, so two concurrent Open() calls race on goose.SetBaseFS / SetLogger / SetDialect. Latent in production (one Open per process), real under -race once tests use t.Parallel(). Fix: a process-wide sync.Mutex scoped only to migrate() — runtime read/write paths are untouched.

Scope notes (read me)

The original task brief and aa-27's compat report describe a fancier architecture than what actually shipped in PR #37 on main. I am being explicit about what I did and didn't cover:

Brief item Status Reason
Real SQLite store + LCM + SM live-fire Done across all 5 tests.
Spawn → OnSpawn → SCM PR open + CI pass → kill happy path TestHappyPath_Spawn_PR_Kill.
Direct store reads for session metadata + change_log event_types TestHappyPath_* + TestCDCPollerReceivesAllStages.
Restore round-trip across close/reopen TestRestoreRoundTrip_PreservesMetadata.
CI-failed path → detecting_* fields persisted Reframed detecting_* columns are about runtime quarantine, not CI — separate axis. Runtime persistence covered by TestDetectingPersistsAcrossRestart; CI nudge + recovery covered by TestCIFailureAndRecovery_NudgeThenClears.
Janitor / consumer_offset / outbox watermark ⛔ Skipped None exist on main. change_log is the only durable CDC table (see cdc/event.go docstring); clients catch up by reading it from their own offset (SSE Last-Event-ID). There is no janitor and no outbox.
reaction_trackers table populated then cleared ⛔ Skipped No reaction_trackers table on main; lifecycle/reactions.go:116-120 keeps trackers in-memory and docs them as policy-not-canonical. The reaction-clearing behaviour is exercised at the messenger/notifier observable in TestCIFailureAndRecovery.
revision monotonicity / Upsert(rec, eventType) signature ⛔ N/A No revision column on sessions (migrations/0001_init.sql:24-61); LifecycleStore is UpdateSession(ctx, rec), not Upsert(ctx, rec, eventType). Ordering is via change_log.seq, asserted in TestCDCPollerReceivesAllStages.

Constraints honored

  • ports.LifecycleStore signature, domain types, and migrations are untouched.
  • storage/sqlite/* modified only for the goose race (one targeted bug fix, ~11 lines).
  • lifecycle/fakes_test.go and all existing unit tests untouched.
  • -race clean across the whole repo.

Test plan

  • go build ./...
  • go vet ./...
  • go test ./... (214 → 219, all pass)
  • go test -race ./... (clean)
  • go test -race -count=1 ./internal/integration/... (5 pass)

🤖 Generated with Claude Code

Adds backend/internal/integration with five end-to-end tests that hydrate the
real lifecycle.Manager + session.Manager against a tmp SQLite store and
exercise the full pipeline through the DB triggers and the CDC poller:

- TestHappyPath_Spawn_PR_Kill — spawn -> SCM PR observation (open + CI
  passing) -> kill; asserts canonical row, pr row, and change_log event
  types (session_created/_updated, pr_created, pr_check_recorded).
- TestRestoreRoundTrip_PreservesMetadata — spawn, kill, close store, reopen
  same DB path, hydrate fresh LCM/SM, Restore(); asserts AgentSessionID and
  the rest of SessionMetadata survive across the daemon restart.
- TestCIFailureAndRecovery_NudgeThenClears — failing CI observation drives
  the CI-failed reaction nudge with the log tail injected; passing CI
  observation switches to approved-and-green human notify; pr_checks history
  reads back the failure (the brake's source of truth).
- TestDetectingPersistsAcrossRestart — failed probe parks the session in
  detecting with detecting_* columns populated, round-trips across a
  close/reopen, alive probe clears the quarantine memory.
- TestCDCPollerReceivesAllStages — drives the real cdc.Poller; asserts the
  trigger pipeline emits each expected event_type and seq is monotonic.

Wiring gap fixed (minimal): goose v3 keeps baseFS/logger/dialect as
package-level globals, so two concurrent sqlite.Open() calls — uncommon in
production but normal under -race with t.Parallel() — race on
goose.SetBaseFS/SetLogger/SetDialect inside migrate(). Added a process-level
sync.Mutex around the migrate() call. ~11 lines, no signature changes.

Scope notes (the task brief assumed a fancier architecture than what
actually shipped in PR #37):
- No outbox / consumer_offsets / janitor exist on main — the change_log
  table IS the durable, ordered source of truth (see cdc/event.go), so the
  brief's janitor-watermark step is skipped.
- No reaction_trackers table / ReactionStore port — trackers are in-memory
  per lifecycle/reactions.go; persistence-round-trip there is N/A.
- No revision column / Upsert(rec, eventType) — write-mutex serialises and
  change_log.seq orders, so the assertions land on event_type + seq, not on
  a per-row revision counter.

All 219 tests pass under -race across 18 packages. lifecycle/fakes_test.go
is untouched; existing unit tests still drive the in-memory fake.
The Check formatting step failed on the stub method alignment in
stubAgent's three method declarations. Re-run of gofmt aligns the column
gutter on those signatures; no behavioural change.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 31, 2026

Greptile Summary

Adds five end-to-end integration tests that exercise the lifecycle + session managers against a real SQLite store and CDC trigger pipeline, and fixes a data race in db.go where concurrent Open() calls could race on goose's package-level globals during parallel test runs.

  • lifecycle_sqlite_test.go — five live-fire tests (happy-path spawn→PR→kill, restore round-trip across close/reopen, CI failure + recovery, detecting-state persistence across restart, CDC poller delivery), each getting its own t.TempDir() database; the liveStack harness uses an idempotent t.Cleanup backstop so mid-test t.Fatalf can't leak the SQLite handle.
  • db.go — adds gooseMu sync.Mutex scoped only to migrate(), serialising the three goose package-level writes (SetBaseFS, SetLogger, SetDialect) without touching any runtime read/write paths.

Confidence Score: 5/5

Safe to merge — the goose mutex fix is a one-liner targeting a real but latent race, and the new test package is isolated with no production-path changes.

Both changed files are low-risk: db.go adds an eleven-line, well-scoped mutex that only guards migration setup and has no effect on the runtime read/write paths; the integration test file introduces no production code. The single finding is a diagnostic quality issue in test assertions that would produce a misleading failure message if a GetSession call happened to fail post-kill, but does not affect the correctness of the behaviour under test.

No files require special attention beyond the inline suggestion on lifecycle_sqlite_test.go.

Important Files Changed

Filename Overview
backend/internal/storage/sqlite/db.go Adds a package-level gooseMu sync.Mutex that serialises all migrate() calls so concurrent Open() invocations in parallel test runs no longer race on goose's package-level globals (SetBaseFS, SetLogger, SetDialect). Minimal and targeted — runtime read/write paths are untouched.
backend/internal/integration/lifecycle_sqlite_test.go New 689-line integration test package covering 5 live-fire scenarios against the real SQLite store: happy-path spawn→PR→kill, restore round-trip, CI failure/recovery, detecting-state persistence across restart, and CDC poller delivery. The liveStack harness correctly uses an idempotent t.Cleanup backstop alongside explicit close() calls. One GetSession result in TestHappyPath_Spawn_PR_Kill silently discards ok and err, which could obscure the true failure cause if the read fails post-kill.

Sequence Diagram

sequenceDiagram
    participant T as Test
    participant SM as session.Manager
    participant LCM as lifecycle.Manager
    participant Store as sqlite.Store
    participant CL as change_log (triggers)
    participant Poller as cdc.Poller

    T->>SM: Spawn(SpawnConfig)
    SM->>Store: CreateSession()
    Store-->>CL: trigger → session_created
    SM->>LCM: OnSpawnCompleted()
    LCM->>Store: UpdateSession()
    Store-->>CL: trigger → session_updated
    SM-->>T: "Session{ID}"

    T->>LCM: ApplyPRObservation()
    LCM->>Store: WritePRObservation() [tx]
    Store-->>CL: trigger → pr_created
    Store-->>CL: trigger → pr_check_recorded

    T->>SM: Kill(TermManuallyKilled)
    SM->>LCM: OnKillRequested()
    LCM->>Store: UpdateSession()
    Store-->>CL: trigger → session_updated

    T->>Poller: Poll(ctx)
    Poller->>Store: ReadChangeLogAfter(lastSeq)
    Store-->>Poller: []Event (seq-ordered)
    Poller->>T: Broadcast events (monotonic seq)
Loading

Reviews (3): Last reviewed commit: "test(integration): check ok/err before p..." | Re-trigger Greptile

Comment thread backend/internal/integration/lifecycle_sqlite_test.go
Comment thread backend/internal/integration/lifecycle_sqlite_test.go Outdated
Comment thread backend/internal/integration/lifecycle_sqlite_test.go
- Idempotent close + t.Cleanup so a mid-test t.Fatalf can't leak the
  SQLite handle for the rest of the binary run. Restart-style tests still
  call close() explicitly between phases; the cleanup hook becomes a no-op
  once that runs.
- Drop the hardcoded "mer-1" assertion in TestHappyPath; assert the
  structural invariant (project-scoped, non-empty id) instead, so the
  test does not couple to the {project}-{counter} generation detail.
- Realign the test storeAdapter's PRFactsForSession + WritePR bodies to
  be line-for-line identical to backend/lifecycle_wiring.go's production
  adapter (extracted prState helper, separated err vs empty-rows checks),
  so a future divergence shows up as a diff at review time. The proper
  fix (extract to internal/storeutil) remains out of scope per the
  brief's "do NOT redesign anything".

All 219 tests still pass under -race.
Comment thread backend/internal/integration/lifecycle_sqlite_test.go Outdated
Greptile P1: the patch-then-Update path in TestRestoreRoundTrip was
discarding GetSession's ok/err. A missed row would have handed UpdateSession
a zero-value SessionRecord (ID==""), which matches zero rows and returns
nil — Phase B then fails with the misleading "agent session id lost across
restart" instead of the real cause.

Fixed at the patch site (the only write path that could swallow the error)
and at the two read-then-assert sites in TestDetectingPersistsAcrossRestart
for consistency. Downstream assertions there would already fail loudly, but
the explicit ok/err check makes the failure mode unambiguous.

All 219 tests still pass under -race.
@Pritom14 Pritom14 merged commit 438b830 into main May 31, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants