test(integration): LCM+SM live-fire against real SQLite store#49
Conversation
Adds backend/internal/integration with five end-to-end tests that hydrate the real lifecycle.Manager + session.Manager against a tmp SQLite store and exercise the full pipeline through the DB triggers and the CDC poller: - TestHappyPath_Spawn_PR_Kill — spawn -> SCM PR observation (open + CI passing) -> kill; asserts canonical row, pr row, and change_log event types (session_created/_updated, pr_created, pr_check_recorded). - TestRestoreRoundTrip_PreservesMetadata — spawn, kill, close store, reopen same DB path, hydrate fresh LCM/SM, Restore(); asserts AgentSessionID and the rest of SessionMetadata survive across the daemon restart. - TestCIFailureAndRecovery_NudgeThenClears — failing CI observation drives the CI-failed reaction nudge with the log tail injected; passing CI observation switches to approved-and-green human notify; pr_checks history reads back the failure (the brake's source of truth). - TestDetectingPersistsAcrossRestart — failed probe parks the session in detecting with detecting_* columns populated, round-trips across a close/reopen, alive probe clears the quarantine memory. - TestCDCPollerReceivesAllStages — drives the real cdc.Poller; asserts the trigger pipeline emits each expected event_type and seq is monotonic. Wiring gap fixed (minimal): goose v3 keeps baseFS/logger/dialect as package-level globals, so two concurrent sqlite.Open() calls — uncommon in production but normal under -race with t.Parallel() — race on goose.SetBaseFS/SetLogger/SetDialect inside migrate(). Added a process-level sync.Mutex around the migrate() call. ~11 lines, no signature changes. Scope notes (the task brief assumed a fancier architecture than what actually shipped in PR #37): - No outbox / consumer_offsets / janitor exist on main — the change_log table IS the durable, ordered source of truth (see cdc/event.go), so the brief's janitor-watermark step is skipped. - No reaction_trackers table / ReactionStore port — trackers are in-memory per lifecycle/reactions.go; persistence-round-trip there is N/A. - No revision column / Upsert(rec, eventType) — write-mutex serialises and change_log.seq orders, so the assertions land on event_type + seq, not on a per-row revision counter. All 219 tests pass under -race across 18 packages. lifecycle/fakes_test.go is untouched; existing unit tests still drive the in-memory fake.
The Check formatting step failed on the stub method alignment in stubAgent's three method declarations. Re-run of gofmt aligns the column gutter on those signatures; no behavioural change.
Greptile SummaryAdds five end-to-end integration tests that exercise the lifecycle + session managers against a real SQLite store and CDC trigger pipeline, and fixes a data race in
Confidence Score: 5/5Safe to merge — the goose mutex fix is a one-liner targeting a real but latent race, and the new test package is isolated with no production-path changes. Both changed files are low-risk: No files require special attention beyond the inline suggestion on Important Files Changed
Sequence DiagramsequenceDiagram
participant T as Test
participant SM as session.Manager
participant LCM as lifecycle.Manager
participant Store as sqlite.Store
participant CL as change_log (triggers)
participant Poller as cdc.Poller
T->>SM: Spawn(SpawnConfig)
SM->>Store: CreateSession()
Store-->>CL: trigger → session_created
SM->>LCM: OnSpawnCompleted()
LCM->>Store: UpdateSession()
Store-->>CL: trigger → session_updated
SM-->>T: "Session{ID}"
T->>LCM: ApplyPRObservation()
LCM->>Store: WritePRObservation() [tx]
Store-->>CL: trigger → pr_created
Store-->>CL: trigger → pr_check_recorded
T->>SM: Kill(TermManuallyKilled)
SM->>LCM: OnKillRequested()
LCM->>Store: UpdateSession()
Store-->>CL: trigger → session_updated
T->>Poller: Poll(ctx)
Poller->>Store: ReadChangeLogAfter(lastSeq)
Store-->>Poller: []Event (seq-ordered)
Poller->>T: Broadcast events (monotonic seq)
Reviews (3): Last reviewed commit: "test(integration): check ok/err before p..." | Re-trigger Greptile |
- Idempotent close + t.Cleanup so a mid-test t.Fatalf can't leak the
SQLite handle for the rest of the binary run. Restart-style tests still
call close() explicitly between phases; the cleanup hook becomes a no-op
once that runs.
- Drop the hardcoded "mer-1" assertion in TestHappyPath; assert the
structural invariant (project-scoped, non-empty id) instead, so the
test does not couple to the {project}-{counter} generation detail.
- Realign the test storeAdapter's PRFactsForSession + WritePR bodies to
be line-for-line identical to backend/lifecycle_wiring.go's production
adapter (extracted prState helper, separated err vs empty-rows checks),
so a future divergence shows up as a diff at review time. The proper
fix (extract to internal/storeutil) remains out of scope per the
brief's "do NOT redesign anything".
All 219 tests still pass under -race.
Greptile P1: the patch-then-Update path in TestRestoreRoundTrip was discarding GetSession's ok/err. A missed row would have handed UpdateSession a zero-value SessionRecord (ID==""), which matches zero rows and returns nil — Phase B then fails with the misleading "agent session id lost across restart" instead of the real cause. Fixed at the patch site (the only write path that could swallow the error) and at the two read-then-assert sites in TestDetectingPersistsAcrossRestart for consistency. Downstream assertions there would already fail loudly, but the explicit ok/err check makes the failure mode unambiguous. All 219 tests still pass under -race.
Summary
Live-fire integration verification of the lifecycle + session lane against the SQLite store + CDC pipeline merged in PR #37. Until now every LCM/SM test ran against the in-memory fake in
lifecycle/fakes_test.go— this PR adds a separate integration package that hydrates the real components in one process and asserts via direct store reads.backend/internal/integration/lifecycle_sqlite_test.go— five end-to-end tests.backend/internal/storage/sqlite/db.go— one minimal fix for a real wiring gap surfaced by-race.All 219 tests pass under
-raceacross 18 packages. The fakes-based unit tests are untouched.What the tests cover
TestHappyPath_Spawn_PR_Kill—SM.Spawn→ SCM PR observation (open + CI passing) →SM.Kill. Asserts the canonical session row (state/is_alive/termination_reason), theprtable row written atomically in one tx, and that the change_log capturedsession_created,session_updated,pr_created,pr_check_recordedevent_types.TestRestoreRoundTrip_PreservesMetadata— spawn, kill, close store, reopen the same DB path, hydrate a fresh LCM/SM,SM.Restore. AssertsMetadata.AgentSessionIDand the rest ofSessionMetadatasurvive a full daemon restart and flow back throughOnSpawnCompleted.TestCIFailureAndRecovery_NudgeThenClears— failing CI observation firesrxCIFailedthrough the messenger with the log tail injected; passing CI observation switches the human notifier toreaction.approved-and-green. Assertspr_checkshistory is the brake's source of truth.TestDetectingPersistsAcrossRestart— failed probe parks the session indetecting;detecting_attempts/detecting_started_at/detecting_evidence_hashround-trip across a close/reopen; alive probe clears the quarantine.TestCDCPollerReceivesAllStages— drivescdc.Pollerover the realchange_log; asserts each expected event_type is delivered andseqis monotonic across the stream.Wiring gap fixed (minimal, called out)
backend/internal/storage/sqlite/db.go(~11 lines) —goose v3keepsbaseFS,logger, anddialectas package-level globals.migrate()mutates them, so two concurrentOpen()calls race ongoose.SetBaseFS/SetLogger/SetDialect. Latent in production (oneOpenper process), real under-raceonce tests uset.Parallel(). Fix: a process-widesync.Mutexscoped only tomigrate()— runtime read/write paths are untouched.Scope notes (read me)
The original task brief and aa-27's compat report describe a fancier architecture than what actually shipped in PR #37 on
main. I am being explicit about what I did and didn't cover:TestHappyPath_Spawn_PR_Kill.TestHappyPath_*+TestCDCPollerReceivesAllStages.TestRestoreRoundTrip_PreservesMetadata.detecting_*columns are about runtime quarantine, not CI — separate axis. Runtime persistence covered byTestDetectingPersistsAcrossRestart; CI nudge + recovery covered byTestCIFailureAndRecovery_NudgeThenClears.consumer_offset/ outbox watermarkchange_logis the only durable CDC table (seecdc/event.godocstring); clients catch up by reading it from their own offset (SSE Last-Event-ID). There is no janitor and no outbox.reaction_trackerstable populated then clearedreaction_trackerstable on main;lifecycle/reactions.go:116-120keeps trackers in-memory and docs them as policy-not-canonical. The reaction-clearing behaviour is exercised at the messenger/notifier observable inTestCIFailureAndRecovery.revisionmonotonicity /Upsert(rec, eventType)signaturerevisioncolumn onsessions(migrations/0001_init.sql:24-61);LifecycleStoreisUpdateSession(ctx, rec), notUpsert(ctx, rec, eventType). Ordering is viachange_log.seq, asserted inTestCDCPollerReceivesAllStages.Constraints honored
ports.LifecycleStoresignature, domain types, and migrations are untouched.storage/sqlite/*modified only for the goose race (one targeted bug fix, ~11 lines).lifecycle/fakes_test.goand all existing unit tests untouched.-raceclean across the whole repo.Test plan
go build ./...go vet ./...go test ./...(214 → 219, all pass)go test -race ./...(clean)go test -race -count=1 ./internal/integration/...(5 pass)🤖 Generated with Claude Code