Postgres: installed os-apps wiped on pod restart — Phase 8b only runs for Turso store #150

## Summary

When Temper runs with `--storage postgres`, previously-installed OS apps are lost on process restart. The SpecRegistry repopulates platform + Agent specs at boot, but **there is no code path that replays tenant os-app installs from Postgres** — the Phase 8b recovery exists only for the Turso store.

Observable effect: after `POST /observe/os-apps/<app>/install`, the app's entity sets (e.g. `/tdata/GitTokens`) work until the next pod restart. After the restart, any request to an entity set defined by the app returns `{"error":{"code":"EntitySetNotFound","message":"Entity set 'GitTokens' not found"}}`.

## Environment

- Temper HEAD `078777d2aea5`.
- Storage: `--storage postgres` (Cloud SQL Postgres 16).
- Deployment: k8s Deployment, 2 replicas.
- Repro was against the `dark-helix` OS app with 21 entity types.

## Root cause

Two gaps working together:

1. **`temper-store-postgres` has no installed-apps tracking.** Trait methods on `PlatformStore`:

   - `is_app_installed(tenant, app)`
   - `record_installed_app(tenant, app)`
   - `list_all_installed_apps()`

   are implemented in `temper-store-turso::store::specs` (see `crates/temper-store-turso/src/store/specs.rs:246`) but have **no equivalent** in `crates/temper-store-postgres/`. Grepping the crate returns zero hits for `list_all_installed_apps` / `record_installed_app` / `is_app_installed`.

2. **Phase 8b only wires the Turso path.** In `crates/temper-cli/src/serve/bootstrap.rs::bootstrap_installed_apps` (~line 564):

   ```rust
   if let Some(ref store) = state.server.event_store
       && let Some(turso) = store.platform_turso_store()
   {
       match turso.list_all_installed_apps().await { ... }
   }
   ```

   With a Postgres-only deployment `platform_turso_store()` returns `None`, so the whole replay block is skipped. There is no fallback to `PostgresStore::list_all_installed_apps()` (which doesn't exist) or to a file-catalog scan.

   The earlier comment at line 517 of the same file — *"OS app specs are already restored from the `specs` table by `restore_registry_from_turso` (Phase 2) and Cedar policies by `recover_cedar_policies` (Phase 6), so no reinstall loop is needed"* — explicitly depends on the Turso restore path, which doesn't exist for Postgres.
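The way the two gaps combine can be modelled in a few lines. This is a minimal sketch, not the Temper code: `EventStore`, `TursoStore`, and `apps_to_replay` are hypothetical stand-ins, and only the shape of the `platform_turso_store()` check is taken from the snippet above.

```rust
/// Hypothetical stand-in for the Turso-backed platform store.
struct TursoStore {
    /// (tenant, app) pairs recorded at install time.
    installed: Vec<(String, String)>,
}

/// Hypothetical stand-in for the configured storage backend.
enum EventStore {
    Turso(TursoStore),
    Postgres,
}

impl EventStore {
    /// Mirrors the shape of `platform_turso_store()`: only the Turso
    /// backend yields a store that can list installed apps.
    fn platform_turso_store(&self) -> Option<&TursoStore> {
        match self {
            EventStore::Turso(turso) => Some(turso),
            EventStore::Postgres => None,
        }
    }
}

/// Simplified model of the Phase 8b gate: returns the (tenant, app)
/// pairs that would be replayed at boot.
fn apps_to_replay(event_store: Option<&EventStore>) -> Vec<(String, String)> {
    if let Some(store) = event_store {
        if let Some(turso) = store.platform_turso_store() {
            return turso.installed.clone();
        }
    }
    // Postgres (or no store at all) falls through: nothing is replayed.
    Vec::new()
}
```

With `EventStore::Postgres` the function returns an empty list no matter what was installed before the restart, which matches the observed `EntitySetNotFound` after step 6 of the repro.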

## Reproduce

1. Run Temper with `--storage postgres` against a fresh Postgres database.
2. Ship an OS app bundle into `TEMPER_OS_APPS_DIR` (e.g. an initContainer extracting a tarball to `/apps/dark-helix`).
3. `POST /observe/os-apps/dark-helix/install` with `{"tenant": "dark-helix"}`. Returns 200 with `added: [...entities]`.
4. `GET /tdata/<any-entity-from-the-bundle>` with `X-Tenant-Id: dark-helix` → 200 OK.
5. Delete the pod (`kubectl delete pod -l app=temper`). Wait for the new pod.
6. Re-issue the same `GET` → 404 `EntitySetNotFound`.

## Suggested fix

Two independent fixes are worth making: a short-term unblocker and a proper long-term solution.

### Short-term: add a filesystem-catalog reinstall on startup

Gate this behind an env var or a CLI flag so existing Turso-based users aren't affected. Pseudocode:

```rust
// Phase 8b: re-install apps found on the local catalog into TEMPER_TENANT.
if std::env::var("TEMPER_AUTO_INSTALL_APPS").ok().as_deref() == Some("true") {
    let tenant = std::env::var("TEMPER_TENANT").unwrap_or_else(|_| "default".into());
    for entry in os_apps::list_os_apps() {
        if let Err(e) = os_apps::install_os_app(state, &tenant, &entry.name).await {
            tracing::warn!("auto-install failed for app='{}': {e}", entry.name);
        }
    }
}
```

This sidesteps the Postgres-tracking gap entirely. It's also the right behavior for deployments where the app catalog is shipped via image/initContainer/PVC (i.e. the canonical k8s pattern) — the filesystem IS the source of truth, so no DB tracking is needed.
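For illustration, the catalog scan and the opt-in check could look roughly like the sketch below. Both `list_catalog_apps` and `auto_install_enabled` are hypothetical helpers, not existing Temper functions, and the sketch assumes each subdirectory of the catalog directory is one app (as in the `/apps/dark-helix` example); the real `os_apps::list_os_apps()` may discover apps differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: treat each subdirectory of `catalog_dir` as one app
/// and return the names sorted, so the reinstall order is deterministic.
fn list_catalog_apps(catalog_dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(catalog_dir)? {
        let entry = entry?;
        if entry.file_type()?.is_dir() {
            // Skip names that are not valid UTF-8 rather than failing the boot.
            if let Some(name) = entry.file_name().to_str() {
                names.push(name.to_owned());
            }
        }
    }
    names.sort();
    Ok(names)
}

/// Hypothetical gate: only the exact value "true" opts in, matching the
/// check in the pseudocode above.
fn auto_install_enabled(value: Option<&str>) -> bool {
    value == Some("true")
}
```

Because every replica would run this loop on every boot, the approach relies on the install being safe to repeat; that is an assumption about `install_os_app`, not something verified here.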

### Long-term: Postgres implementations of the three trait methods

Add a `tenant_installed_apps` table in `temper-store-postgres::schema`, implement the three PlatformStore trait methods, and remove the Turso-only gate in `bootstrap_installed_apps`. This is the symmetric solution but requires a migration.
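A rough sketch of what the table and the three methods would need to guarantee. The column names, types, and SQL text are assumptions (they are not copied from the Turso schema), and `InstalledApps` is an in-memory model of the intended semantics rather than a `PlatformStore` implementation.

```rust
use std::collections::BTreeSet;

/// Assumed DDL for the proposed table; column names and types are
/// illustrative only.
const CREATE_TENANT_INSTALLED_APPS: &str = "\
CREATE TABLE IF NOT EXISTS tenant_installed_apps (
    tenant       TEXT        NOT NULL,
    app          TEXT        NOT NULL,
    installed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (tenant, app)
)";

/// Assumed upsert: re-recording an existing install is a no-op.
const RECORD_INSTALLED_APP: &str = "\
INSERT INTO tenant_installed_apps (tenant, app)
VALUES ($1, $2)
ON CONFLICT (tenant, app) DO NOTHING";

/// In-memory model of the semantics the three methods should have.
#[derive(Default)]
struct InstalledApps {
    rows: BTreeSet<(String, String)>,
}

impl InstalledApps {
    /// True if this (tenant, app) pair has been recorded.
    fn is_app_installed(&self, tenant: &str, app: &str) -> bool {
        self.rows.contains(&(tenant.to_owned(), app.to_owned()))
    }

    /// Idempotent: recording the same pair twice leaves one row.
    fn record_installed_app(&mut self, tenant: &str, app: &str) {
        self.rows.insert((tenant.to_owned(), app.to_owned()));
    }

    /// All recorded pairs, which is what the boot-time replay would iterate.
    fn list_all_installed_apps(&self) -> Vec<(String, String)> {
        self.rows.iter().cloned().collect()
    }
}
```

The composite primary key is what makes the record step idempotent, which matters with 2 replicas that may both handle or replay an install.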

## Context

Hit this while bringing up Temper on GKE as the control plane for the dark-helix factory. The current workaround is to (a) scale the Temper Deployment to 1 replica and (b) re-run the install after every pod restart. Neither is acceptable for a production control plane. The plan is to implement the short-term fix on a local branch and build from that while the upstream work lands.

Related: nerdsane/temper#148 (tenant_secrets migration — another Postgres-specific init-path bug from the same pipeline).
