feat(persistence,rest,hfs): FHIR Bulk Data Export ($export) — async kick-off, postgres-s3 multi-instance, Inferno v2.0.0 by aacruzgon · Pull Request #108 · HeliosSoftware/hfs

aacruzgon · 2026-05-15T17:00:27Z

FHIR Bulk Data Export

Implements the FHIR Bulk Data Access IG $export family (system / patient / group) end-to-end, per Discussion #104. Embedded single-instance (SQLite job state + local-FS output + in-process worker pool) is the zero-config default; a multi-instance topology (PostgreSQL job state + S3-compatible output with pre-signed download URLs) is selected at startup with no handler changes. Ships with an external smoke workflow that exercises both topologies on every run and an Inferno Bulk Data IG v2.0.0 conformance workflow against the full SMART Backend Services + Keycloak stack.

Why

Bulk export is the API population-health platforms, payer-provider exchanges, registries, and research/AI pipelines converge on. CRUD + search are not enough once a workload needs every Observation for every patient in a cohort — that's a data-engineering problem (long-running work, durable state, fileserver bandwidth, multi-instance fan-out) rather than a request/response one. The IG defines an asynchronous, manifest-based, NDJSON-over-HTTPS pattern; this PR ships it as a first-class HFS subsystem.

Changes

Persistence — new traits + types (helios-persistence)

core/bulk_export.rs: extended ExportRequest (until / elements / include_associated_data / patient_refs); extended ExportManifest (deleted / link); new StartExportInput, RawExportManifest, RawManifestEntry, ExportJobMetadata, ExportFileMetadata, ExpiredExportRef. BulkExportStorage trait grows start_export(StartExportInput), get_export_manifest -> RawExportManifest, plus get_export_job_metadata / get_export_file_metadata / count_active_exports / list_expired_exports. GroupExportProvider grows get_group_members_with_periods (default impl + SQLite/Postgres overrides) so the _since-newly-added filter can read Group.member.period.start.
core/bulk_export_output.rs: new ExportOutputStore trait + ExportPartKey (with embedded fencing_token), ExportPartWriter, FinalizedPart, DownloadUrl. Decouples where the bytes go from job state.
core/bulk_export_worker.rs: new ExportClaimStrategy, fully-fenced ExportWorkerStorage (every mutation guarded by (worker_id, fencing_token); 0 affected rows ⇒ LeaseError::LeaseLost), BulkExportJobStore marker trait, DefaultExportWorker<Js, Dp, Os> runtime that drives a claimed job under its lease, applies _typeFilter / _since / _until / _elements, resumes from persisted cursors, and honors since_newly_added=exclude. The worker now branches on (level, patient_refs): Patient + non-empty patient_refs delegates to fetch_patient_compartment_batch so POST /Patient/$export?patient=… actually scopes to those patients (previously it ignored the filter and returned every resource of each requested type).
BulkExportError::LeaseLost variant.
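
The fencing rule described above (every mutation guarded by (worker_id, fencing_token), with zero matching rows mapped to LeaseLost) can be illustrated with a small in-memory sketch. This is not the HFS implementation: JobRow, InMemoryJobs, claim, and save_cursor are hypothetical names, and a HashMap stands in for the SQLite/PostgreSQL tables.

```rust
use std::collections::HashMap;

/// Error returned when a worker's lease is no longer current.
#[derive(Debug, PartialEq)]
pub enum LeaseError {
    LeaseLost,
}

/// Hypothetical in-memory job row; the real rows live in SQLite or PostgreSQL.
#[derive(Debug, Clone)]
pub struct JobRow {
    pub worker_id: String,
    pub fencing_token: u64,
    pub cursor: Option<String>,
}

/// Hypothetical stand-in for the job-state table.
#[derive(Default)]
pub struct InMemoryJobs {
    rows: HashMap<String, JobRow>,
}

impl InMemoryJobs {
    /// Claim (or reclaim) a job. The token is bumped on every claim so that
    /// any earlier holder of the lease is fenced out.
    pub fn claim(&mut self, job_id: &str, worker_id: &str) -> u64 {
        let row = self.rows.entry(job_id.to_string()).or_insert(JobRow {
            worker_id: String::new(),
            fencing_token: 0,
            cursor: None,
        });
        row.worker_id = worker_id.to_string();
        row.fencing_token += 1;
        row.fencing_token
    }

    /// Fenced mutation, analogous to
    /// `UPDATE ... WHERE worker_id = ? AND fencing_token = ?`.
    /// No matching row is reported as `LeaseError::LeaseLost`.
    pub fn save_cursor(
        &mut self,
        job_id: &str,
        worker_id: &str,
        token: u64,
        cursor: &str,
    ) -> Result<(), LeaseError> {
        match self.rows.get_mut(job_id) {
            Some(row) if row.worker_id == worker_id && row.fencing_token == token => {
                row.cursor = Some(cursor.to_string());
                Ok(())
            }
            _ => Err(LeaseError::LeaseLost),
        }
    }
}
```

In this sketch a worker that claimed the job with token 1 gets LeaseLost on its next write once another worker has reclaimed the job with token 2, which is the stale-worker case the integration tests are described as covering.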

Persistence — backends

SQLite (backends/sqlite/): v7→v8 schema migration (lease columns + part_index/fencing_token on bulk_export_files + 0-based-sequential part_index backfill before the unique index); ExportClaimStrategy (process-local mutex), fenced ExportWorkerStorage, get_group_members_with_periods, nested-Group flattening with cycle guard. Patient-level export query parameter binding fixed.
PostgreSQL (backends/postgres/): v7→v8 migration (ADD COLUMN IF NOT EXISTS + ROW_NUMBER() backfill); PostgresSkipLocked claim via SELECT … FOR UPDATE SKIP LOCKED; fenced ExportWorkerStorage; correct int4 / int8 bind sites for bulk_export_progress / bulk_export_files. Cursor timestamps are now parsed with DateTime::parse_from_rfc3339 and bound as DateTime so the wire type matches the inferred TIMESTAMPTZ — without this fix, every paginated export job failed on its second fetch_export_batch with a TEXT/TIMESTAMPTZ type-mismatch error.
S3 (backends/s3/): removed BulkExportStorage impl + synchronous run_export_job (S3 is output-only — job state lives in SQLite/Postgres); kept ExportDataProvider; added stub Patient/GroupExportProvider. New S3OutputStore (multipart upload to MinIO/S3, pre-signed GET via new S3Api::presign_get over aws_sdk_s3::presigning::PresigningConfig).
Local FS (backends/local_fs/): new LocalFsOutputStore (tokio::fs + .tmp→atomic rename, idempotent delete).
MongoDB/Elasticsearch: stub ExportDataProvider/Patient/GroupExportProvider returning UnsupportedCapability.
CompositeStorage: gains export_provider: Option<DynGroupExportProvider> set by with_full_primary (now bounded T: GroupExportProvider); delegates the three traits to the primary or returns UnsupportedCapability.

REST (helios-rest)

bulk_export_auth.rs: ExportFileAuth trait + BearerScopeAuth default (ownership against job_owner_subject or system/* wildcard, system/{ResourceType}.rs scope check; None principal short-circuits when auth is disabled).
handlers/bulk_export.rs: three route-specific kick-off wrappers (system_/patient_/group_export_kickoff_handler) over a shared kickoff_export; status / cancel / download. Parses repeated query params via url::form_urlencoded, validates _typeFilter (rejects result-control params), enforces SmartScopePolicy per requested resource type + Group, enforces the per-tenant cap via count_active_exports, builds StartExportInput with frozen kick-off metadata, assembles the wire ExportManifest from RawExportManifest + ExportOutputStore::download_url, runs the two-step output-then-job teardown on cancel, and emits audit events at every lifecycle step.
state.rs: AppState gains Arc<dyn BulkExportJobStore> + Arc<dyn ExportOutputStore> + Arc<dyn ExportFileAuth> + Arc<BulkExportConfig> + with_bulk_export(...).
config.rs: BulkExportConfig with the full HFS_BULK_EXPORT_* env surface and validation (rejects local-fs + requires_access_token=false).
routing/fhir_routes.rs: routes registered before the /{resource_type} catch-all; adds ExportDataProvider + PatientExportProvider + GroupExportProvider to the router's S bound.
lib.rs: new create_app_with_auth_and_bulk_export(storage: Arc<S>, …, BulkExportBundle) sharing the inner build_app.
handlers/capabilities.rs: advertises $export system-level operations + per-resource Patient.$export / Group.$export + IG instantiates.
handlers/compartment.rs: refactored to call helios_fhir::get_compartment_params.
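
As a rough illustration of the _typeFilter validation mentioned for handlers/bulk_export.rs, the sketch below rejects search result-control parameters inside a filter value. It rests on several assumptions: the rejected parameter list is the general FHIR search-result set rather than the exact list HFS uses, a value without a "?" is treated as invalid, percent-decoding is skipped, and validate_type_filter / TypeFilterError are hypothetical names.

```rust
/// Search result parameters rejected inside `_typeFilter` in this sketch.
/// Assumption: the exact list enforced by HFS may differ.
const RESULT_CONTROL_PARAMS: &[&str] = &[
    "_sort", "_count", "_include", "_revinclude", "_summary",
    "_total", "_elements", "_contained", "_containedType",
];

#[derive(Debug, PartialEq)]
pub enum TypeFilterError {
    /// The value has no `?`, so there is no query part to apply.
    MissingQuery,
    /// Nothing precedes the `?`.
    EmptyResourceType,
    /// A result-control parameter was found in the query part.
    ResultControlParam(String),
}

/// Validate one `_typeFilter` value of the form `ResourceType?name=value&...`.
/// Returns the resource type on success. Percent-decoding is not handled here.
pub fn validate_type_filter(filter: &str) -> Result<&str, TypeFilterError> {
    let (resource_type, query) = filter
        .split_once('?')
        .ok_or(TypeFilterError::MissingQuery)?;
    if resource_type.is_empty() {
        return Err(TypeFilterError::EmptyResourceType);
    }
    for pair in query.split('&').filter(|p| !p.is_empty()) {
        // The parameter name is the text before `=`, minus any `:modifier`.
        let name = pair.split('=').next().unwrap_or("");
        let base = name.split(':').next().unwrap_or("");
        if RESULT_CONTROL_PARAMS.iter().any(|p| *p == base) {
            return Err(TypeFilterError::ResultControlParam(base.to_string()));
        }
    }
    Ok(resource_type)
}
```

A kick-off handler could run this over each repeated _typeFilter value before building the job input, returning an error response for the first rejected value.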

Auth (helios-auth)

New DisabledJtiCache — no-op JtiCache implementation that disables the JWT replay-protection cache. Re-exported from the crate root and selectable in HFS via the existing HFS_AUTH_JTI_BACKEND setting; lets deployments that don't require replay protection (or that handle it upstream) avoid a Redis/SQLite dependency.
discovery.rs: SMART well-known metadata now advertises token_endpoint_auth_signing_alg_values_supported, code_challenge_methods_supported (S256), and adds authorization_code to grant_types_supported when an authorization endpoint is configured. Required by Inferno SMART App Launch IG STU2 / Backend Services discovery checks; without these, the SMART Backend Services Inferno group fails the well-known capability test before it can establish a bearer token.

helios-fhir

lib.rs: free get_compartment_params(version, compartment_type, resource_type) dispatching per FhirVersion to the per-version generated lookups.

helios-fhirpath

reference_key_functions.rs: drop a redundant & in a format! argument (clippy nit surfaced once the workspace was built with the bulk-export feature graph).

helios-hfs

main.rs: switched ServerConfig::parse() → ::from_env() so #[arg(skip)] sub-structs (multitenancy, bulk_export) actually populate from env (pre-existing bug). New generic build_bulk_export helper supporting embedded (dedicated SqliteBackend job store + LocalFsOutputStore) and postgres-s3 (PostgresBackend + S3OutputStore); wired into start_sqlite and start_postgres. The embedded backend now calls create_dir_all on the parent of {HFS_BULK_EXPORT_OUTPUT_DIR}/bulk_export.db before opening the SQLite connection (without this, HFS exited on startup whenever the configured output dir hadn't been pre-created, which broke CI smoke jobs that only ran mkdir on RESULTS_DIR). spawn_export_workers launches HFS_BULK_EXPORT_WORKER_CONCURRENCY claim/run loops + a periodic cleanup task that pages list_expired_exports and runs the two-step teardown. Recognizes the disabled JTI backend.
Cargo.toml: adds chrono.

Ops + docs

docker/bulk-export/docker-compose.yml: HFS + Postgres + MinIO + Keycloak; the substrate for the manual Inferno workflow and multi-instance smoke.
.github/workflows/bulk-export-smoke.yml (new): external smoke workflow that brings up HFS in both sqlite/local-fs and postgres/s3 topologies and runs the smoke runner against each on every push.
crates/hfs/tests/bulk_export/run_external_bulk_export_smoke.sh (new, ~500 lines): end-to-end smoke runner (kick-off → status poll → manifest → file download → DELETE cancel → 404 verification). Header parsing uses grep -i instead of gawk-only IGNORECASE, so it works on mawk-based runners (without this, every smoke job silently dropped the Content-Location header and failed the kick-off step).
.github/workflows/inferno-bulk-data.yml: rebuilt against the shared docker-compose stack; runs SMART Backend Services and Bulk Data Export Tests as two sequential test_runs (the suite ID can't be POSTed as a test_group_id — Inferno returns 422, "must be run as part of a group"); carries the smart_auth_info produced by the SMART group into the Export run so kick-offs are authenticated; passes the now-mandatory since_timestamp input (2000-01-01T00:00:00.000Z); seeded heart-rate Observation gains the vital-signs category so R4 profile validation passes; allows Inferno's 5-minute private_key_jwt assertion lifetime on the generated Keycloak client; treats the file-server TLS test as known-omitted in the HTTP-only CI setup; preserves the MinIO client alias across job steps. Suite + group identifiers are still read from kit source at runtime, not hard-coded.
crates/auth/README.md: documents the new disabled JTI backend.
CLAUDE.md: Bulk Data Export endpoint table, full env-var table with defaults, single-instance vs multi-instance recipes, behavior notes.
crates/hfs/README.md: Bulk Data Export quick-start.
ROADMAP.md: $export marked shipped; $bulk-submit (ingestion) called out as next.
Cargo.lock: lettre bumped to address a security audit finding.
codecov.yml: crates/hfs/src/main.rs excluded from coverage (binary entry point not reachable from unit/integration tests).

Testing

cargo fmt --all — green
cargo build (default) — green
cargo clippy -p helios-persistence -p helios-rest -p helios-hfs --features R4,postgres,s3,mongodb,elasticsearch,audit --all-targets -- -D warnings (CLAUDE.md lint allow-list) — green
cargo test -p helios-persistence --features R4 --lib bulk — 49 pass / 0 fail (incl. nested-Group cycle guard, stale-worker fencing, end-to-end DefaultExportWorker, v7→v8 duplicate-row backfill migration, since_newly_added=exclude filter)
cargo test -p helios-persistence --features R4 --test sqlite_tests — adds a Patient-level export-without-_since integration test (~83 LOC) covering the bug fix
cargo test -p helios-rest --features R4 --test bulk_export — 13 pass / 0 fail (full lifecycle, status mappings, _typeFilter validation, strict/lenient handling, capability statement, metadata-lookup failure paths, plus 5 new integration tests: POST kick-off with Parameters body, _since, invalid _since, _elements, valid _typeFilter)
New unit-test coverage for BearerScopeAuth (5 cases: no-principal bypass, owner + scope, wildcard override, missing-read-scope rejection) and BulkExportConfig::validate() (6 cases — every error branch). Closes the codecov/patch gap (69.70 % → ≥ 75.74 %).
cargo test -p helios-persistence --features postgres,R4 --test postgres_tests -- postgres_integration::postgres_integration_export — 3 pass / 0 fail against a real Postgres testcontainer (claim SKIP LOCKED, stale-worker fencing across reclaim, count_active/list_expired). The export-claim test now serializes against a process-wide mutex so it doesn't race other postgres integration tests sharing the same container.
RUN_MINIO_S3_TESTS=1 cargo test -p helios-persistence --features s3,R4 --test minio_s3_tests -- test_minio_s3_output_store — 1 pass / 0 fail against MinIO (write → finalize → pre-signed GET → reader → idempotent delete)
External smoke workflow (bulk-export-smoke.yml): runs on every push. Brings up HFS in sqlite/local-fs and postgres/s3 topologies, executes the full kick-off → poll → download → cancel → 404 lifecycle in each; both jobs green.
Multi-instance smoke (manual): brought up Postgres + MinIO + two release/hfs instances on ports 8080/8081 sharing Postgres job state. Kicked off /Patient/$export against instance 1 (202 + Content-Location); polled the status URL on instance 2 (202 then 200 + manifest); manifest carried requiresAccessToken: false and a pre-signed AWS-SHA256 S3 URL pointing at MinIO; downloaded directly from MinIO (5 NDJSON lines); DELETE on instance 1 → 202; subsequent poll on instance 2 → 404.
Inferno Bulk Data IG v2.0.0: cloned inferno-framework/bulk-data-test-kit, executed bundle exec inferno execute --suite bulk_data_v200 against the stack. Result: 16 leaf pass / 8 leaf fail / 44 skip; inferno execute exit 3. All 12 bulk-data export-side leaf tests passed (system / patient / group $export + Content-Location, capability advertisement on each level, cancel 202, post-cancel poll 404). The 8 leaf failures are SMART Backend Services (1.1.02, 1.2.02–1.2.05) and TLS (2.1.01) prerequisites — known-deferred environmental requirements that need a configured Keycloak realm with HFS_AUTH_ENABLED=true and HTTPS termination; both are wired into docker/bulk-export/ for production but were not enabled for this local run. Skips are dependent tests Inferno auto-skips when their kick-off chain is broken by a SMART/TLS skip.

Notes

Migrations: schema bumps to v8 on both SQLite and PostgreSQL. Forward-only. The bulk_export_files.part_index backfill runs before the new unique index is created, so existing deployments with multiple file rows per (job, file_type, resource_type) upgrade cleanly. A focused migration test (test_migration_v7_to_v8_backfills_duplicate_file_rows) covers the duplicate-row case.
Trait contract changes: BulkExportStorage::start_export and get_export_manifest signatures changed; in-tree backends (SQLite, Postgres, S3) are updated. External downstream impls of BulkExportStorage will need to adopt StartExportInput and RawExportManifest.
S3 backend posture: S3 is now output-only for bulk export. Its BulkExportStorage impl was removed; only ExportDataProvider remains. Job state must live in SQLite (HFS_BULK_EXPORT_BACKEND=embedded) or PostgreSQL (postgres-s3). An HFS_STORAGE_BACKEND=s3 deployment now picks up bulk export through the embedded SQLite job store with no additional config.
Auth: bulk export endpoints sit inside the existing auth middleware. BearerScopeAuth validates download requests against the job owner subject or a system/*.rs wildcard. With HFS_AUTH_ENABLED=false (default), enforcement is bypassed — matching the rest of HFS. The new disabled JTI backend lets deployments that don't need replay protection skip Redis/SQLite for the JTI cache; SMART discovery additions (token_endpoint_auth_signing_alg_values_supported, S256, authorization_code) are required for Inferno SMART Backend Services / STU2 conformance.
Patient-level export filter: POST /Patient/$export?patient=Patient/123 now actually scopes to the listed patient compartments. Before this fix, the patient_refs field on ExportRequest was populated but never consulted by the worker, so every resource of every requested type was returned.
Postgres cursor binding: the keyset cursor's RFC 3339 timestamp is now parsed and bound as DateTime in fetch_export_batch / fetch_patient_compartment_batch. Without this, the second batch of any paginated export against PostgreSQL failed with a TEXT/TIMESTAMPTZ type-mismatch — which only ever showed up on jobs large enough to exceed HFS_BULK_EXPORT_BATCH_SIZE.
Pre-existing config fix: main() now uses ServerConfig::from_env() instead of ::parse(). This was needed because multitenancy and bulk_export are #[arg(skip)] for clap and were therefore never populated from env in the binary; previously HFS_TENANT_* env vars also weren't fully reaching the binary through this code path. Behavior change: env-derived multitenancy + bulk-export config now actually applies.
Embedded job-store path: {HFS_BULK_EXPORT_OUTPUT_DIR}/bulk_export.db is created on demand. The bootstrap now calls create_dir_all on the parent before opening SQLite, rather than requiring callers to pre-create it.
Smoke runner portability: header parsing in run_external_bulk_export_smoke.sh uses grep -i | sed | tr -d '\r' instead of awk … IGNORECASE=1 so it runs on mawk (default awk on the self-hosted runners) as well as gawk.
Inferno workflow: workflow_dispatch-only (matches the existing inferno-us-core.yml / inferno-subscription.yml). The local execution above uses HTTP without auth; CI runs against the full docker/bulk-export/ stack with Keycloak + HTTPS. The workflow now POSTs SMART Backend Services and Bulk Data Export Tests as two sequential test_runs (Inferno rejects the suite ID as a test_group_id), passes smart_auth_info from the first into the second so kick-offs are authenticated, supplies the now-mandatory since_timestamp, and seeds an Observation that satisfies the Heart Rate profile. Suite ids are read from kit source so they don't drift.
since_newly_added=exclude uses Group.member.period.start to filter "patients added after _since". Default is include (return everything).
Worker concurrency / leasing: defaults to 2 workers per pod, 60-second leases with 20-second heartbeats; tunable via HFS_BULK_EXPORT_*. Stale-worker fencing is verified by integration tests on both SQLite and Postgres.
Dependency bump: lettre updated in Cargo.lock to clear a security audit finding.

Implements Discussion #104.

- ExportRequest gains until / elements / include_associated_data / patient_refs
- ExportManifest gains deleted / link (IG-required)
- New StartExportInput bundles kickoff metadata (transaction_time, request_url, owner_subject, fhir_version)
- New RawExportManifest / RawManifestEntry: storage-side manifest carrying ExportPartKey rather than wire URLs
- New ExportJobMetadata, ExportFileMetadata, ExpiredExportRef
- New GroupExportProvider::get_group_members_with_periods (default impl derived from get_group_members) so backends can surface Group.member.period.start for the _since-newly-added filter
- BulkExportStorage gains start_export(StartExportInput) signature, RawExportManifest return, get_export_job_metadata, get_export_file_metadata, count_active_exports, list_expired_exports

ExportPartKey (with embedded fencing_token), ExportPartWriter (line + byte counter over a boxed AsyncWrite), FinalizedPart, DownloadUrl, and the ExportOutputStore trait. Decouples 'where the bytes go' from the job-state backend.

…rker
- WorkerId, ExportJobLease (with fencing_token), LeaseError
- ExportClaimStrategy: claim_next + heartbeat + release
- ExportWorkerStorage: every method fenced by (worker_id, fencing_token) so a stale worker cannot mutate progress, file rows, or terminal status after its lease has been reclaimed
- BulkExportJobStore marker trait (BulkExportStorage + ExportWorkerStorage + ExportClaimStrategy) for bootstrap-time selection of the job store
- DefaultExportWorker drives a claimed job to completion under its lease, applying _typeFilter / _since / _until / _elements, supporting resume from the persisted cursor, and honoring since_newly_added=exclude via Group.member.period.start

…umns bulk_export_jobs: worker_id, lease_expiry, fencing_token, heartbeat_at, owner_subject, request_url, fhir_version + idx_export_jobs_claim. bulk_export_files: part_index, fencing_token + a backfill that assigns 0-based sequential part_index per (job_id, file_type, resource_type) before creating the unique idx_export_files_part. Includes test exercising the duplicate-row backfill case.
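
The part_index backfill can be pictured as a per-group row numbering. The sketch below reproduces that numbering in memory; FileRow and backfill_part_index are hypothetical names, and ordering by row id within each group is an assumption (the migration's actual ordering column is not stated here).

```rust
use std::collections::HashMap;

/// Hypothetical projection of a `bulk_export_files` row.
#[derive(Debug, Clone, PartialEq)]
pub struct FileRow {
    pub id: i64,
    pub job_id: String,
    pub file_type: String,
    pub resource_type: String,
    pub part_index: i32,
}

/// Assign 0-based sequential `part_index` values within each
/// (job_id, file_type, resource_type) group. Rows are ordered by `id`
/// first (assumed ordering), so the result is deterministic.
pub fn backfill_part_index(rows: &mut [FileRow]) {
    rows.sort_by_key(|r| r.id);
    // Next free index per group key.
    let mut next: HashMap<(String, String, String), i32> = HashMap::new();
    for row in rows.iter_mut() {
        let key = (
            row.job_id.clone(),
            row.file_type.clone(),
            row.resource_type.clone(),
        );
        let counter = next.entry(key).or_insert(0);
        row.part_index = *counter;
        *counter += 1;
    }
}
```

Once every row in a group has a distinct part_index, a unique index over the group key plus part_index can be created without tripping over pre-existing duplicate rows, which is why the backfill has to run first.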

- start_export(StartExportInput): persists frozen kickoff metadata
- get_export_manifest -> RawExportManifest assembled from rows
- get_export_job_metadata / get_export_file_metadata
- count_active_exports / list_expired_exports
- ExportClaimStrategy via process-local mutex + INSERT/UPDATE
- ExportWorkerStorage: every mutation fenced by worker_id + fencing_token (UPDATE … WHERE worker_id=? AND fencing_token=? for terminals, WHERE EXISTS-guarded ON CONFLICT upserts for progress + file rows)
- get_group_members_with_periods reads Group.member.period.start
- resolve_group_patient_ids flattens nested Groups with a cycle guard
- Tests: stale-worker fencing, claim/lifecycle, group-cycle, since_newly_added

ALTER TABLE bulk_export_jobs ADD COLUMN IF NOT EXISTS … for the lease fields, owner_subject, request_url, fhir_version. ALTER bulk_export_files for part_index + fencing_token; ROW_NUMBER() backfill before the unique idx_export_files_part.

PostgresSkipLocked claim strategy (FOR UPDATE SKIP LOCKED inside a transaction), fully-fenced ExportWorkerStorage (every mutation guarded by worker_id + fencing_token), all new BulkExportStorage methods, get_group_members_with_periods + nested-Group flattening with cycle guard. Bind sites use i32 / i64 to match the actual column types on bulk_export_progress / bulk_export_files.

S3 is no longer a bulk-export job-state backend; the model is preserved for a future read-modify-write integration.

Reserved for future S3OutputStore integration; unused now that the synchronous BulkExportStorage path has been removed.

S3 is output-only for bulk export — job state lives in SQLite or PostgreSQL. Drops the synchronous start_export / run_export_job path and adds stub PatientExportProvider / GroupExportProvider impls returning UnsupportedCapability so an S3-resource-storage deployment satisfies the trait hierarchy.

ExportOutputStore impl backed by AwsS3Client. open_writer returns a local scratch tempfile; finalize_part fsyncs + put_object's it to S3 under {tenant}/exports/{job_id}/{file_type}-{rt}-{part}-{token}.ndjson. download_url either pre-signs (Auto / AlwaysPresigned) or returns an HFS-served URL (AlwaysToken). delete_job_outputs lists + deletes by prefix. AccessTokenMode encodes the requires_access_token posture.
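
A minimal sketch of the key layout quoted above, {tenant}/exports/{job_id}/{file_type}-{rt}-{part}-{token}.ndjson. PartKey is a hypothetical stand-in for ExportPartKey, and its field names and integer widths are assumptions; only the layout string comes from the description.

```rust
/// Hypothetical stand-in for `ExportPartKey`; field names and types are assumed.
pub struct PartKey<'a> {
    pub tenant: &'a str,
    pub job_id: &'a str,
    pub file_type: &'a str,
    pub resource_type: &'a str,
    pub part_index: u32,
    pub fencing_token: u64,
}

/// Build the object key for one finalized part, following the
/// `{tenant}/exports/{job_id}/{file_type}-{rt}-{part}-{token}.ndjson` layout.
pub fn object_key(key: &PartKey<'_>) -> String {
    format!(
        "{}/exports/{}/{}-{}-{}-{}.ndjson",
        key.tenant,
        key.job_id,
        key.file_type,
        key.resource_type,
        key.part_index,
        key.fencing_token
    )
}

/// Prefix shared by every object of one job, suitable for list + delete by prefix.
pub fn job_prefix(tenant: &str, job_id: &str) -> String {
    format!("{}/exports/{}/", tenant, job_id)
}
```

Because the fencing token is part of the key, a part written by a stale worker lands under a different object name than the part written by the worker that reclaimed the job, so the two cannot overwrite each other.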

bulk_export_start_manifest_and_delete is gone (the impl was removed); bulk_export_invalid_format_and_fetch_batch_cursor is reduced to the fetch_export_batch cursor case which still exercises ExportDataProvider.

postgres_integration_export_claim_skip_locked: claim ordering, fencing token bumps. postgres_integration_export_stale_worker_fenced_out: LeaseLost on every fenced ExportWorkerStorage call after reclaim. postgres_integration_export_count_active_and_expire: count + list filtering. claim_specific helper drains foreign jobs so tests can cope with the shared SHARED_PG container.

…add S3OutputStore round-trip The lifecycle test now exercises the remaining ExportDataProvider surface. Adds test_minio_s3_output_store_round_trip: write → finalize → pre-signed GET → open_reader → idempotent delete against MinIO.

…ort_batch S3 is no longer a bulk-export job-state backend; verify the ExportDataProvider data feed instead.

ExportOutputStore impl backed by tokio::fs. open_writer creates a .tmp under ${HFS_DATA_DIR}/exports/{tenant}/{job_id}/, finalize_part fsyncs + atomic rename, download_url returns an HFS-served URL with requires_access_token=true, open_reader serves the file, and delete_job_outputs is idempotent. Includes a write→finalize→read→delete round-trip test.
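
The .tmp-then-rename pattern can be sketched with the synchronous standard library. The description above says the real store uses tokio::fs; std::fs is swapped in here only to keep the example dependency-free, and write_part_atomically / delete_job_outputs are hypothetical free functions rather than the trait methods.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write `lines` as NDJSON to `<final_path>.tmp`, fsync, then rename into
/// place so readers never observe a partially written part.
pub fn write_part_atomically(final_path: &Path, lines: &[&str]) -> io::Result<()> {
    if let Some(parent) = final_path.parent() {
        fs::create_dir_all(parent)?;
    }
    // Append ".tmp" to the full file name (not replacing the extension).
    let mut tmp_name = final_path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp_path)?;
    for line in lines {
        file.write_all(line.as_bytes())?;
        file.write_all(b"\n")?;
    }
    // Flush file contents to disk before the rename publishes the file.
    file.sync_all()?;
    drop(file);
    fs::rename(&tmp_path, final_path)
}

/// Idempotent delete: a directory that is already gone counts as success.
pub fn delete_job_outputs(job_dir: &Path) -> io::Result<()> {
    match fs::remove_dir_all(job_dir) {
        Ok(()) => Ok(()),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(e) => Err(e),
    }
}
```

The rename is only atomic when the temporary file and the final file are on the same filesystem, which is why the sketch places the .tmp file next to its destination.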

ExportDataProvider / PatientExportProvider / GroupExportProvider impls returning UnsupportedCapability so MongoDB can satisfy the trait hierarchy without supporting bulk export as a primary.

CompositeStorage gains an export_provider: Option<DynGroupExportProvider> field set by with_full_primary (with the new GroupExportProvider bound on T). Each trait method delegates to the primary or returns UnsupportedCapability when no primary impl is wired in.

Authorizes the HFS-served (requires_access_token=true) download path using the helios_auth Principal — checks ownership against job_owner_subject (or system/* wildcard) plus a system/{ResourceType}.rs scope. Pre-signed downloads bypass HFS and never reach this trait.
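
The download authorization described above might look roughly like the following. This is a sketch under assumptions: Principal here is a simplified stand-in for the helios_auth type, scopes are matched as exact strings, the wildcard is taken to be system/*.rs, and authorize_download / Decision are hypothetical names.

```rust
/// Simplified stand-in for the authenticated caller (assumed shape).
pub struct Principal {
    pub subject: String,
    pub scopes: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Allow,
    Deny(&'static str),
}

/// Decide whether `principal` may download a file of `resource_type`
/// belonging to the job owned by `job_owner_subject`.
pub fn authorize_download(
    principal: Option<&Principal>,
    job_owner_subject: &str,
    resource_type: &str,
) -> Decision {
    // No principal means auth is disabled: short-circuit to allow.
    let p = match principal {
        None => return Decision::Allow,
        Some(p) => p,
    };
    let has = |scope: &str| p.scopes.iter().any(|s| s.as_str() == scope);
    let wildcard = has("system/*.rs");

    // Ownership: the caller owns the job, or holds the wildcard scope.
    if p.subject != job_owner_subject && !wildcard {
        return Decision::Deny("not the job owner");
    }
    // Read scope for the specific resource type (wildcard also satisfies it).
    let typed = format!("system/{}.rs", resource_type);
    if !wildcard && !has(&typed) {
        return Decision::Deny("missing read scope");
    }
    Decision::Allow
}
```

Exact string matching is the simplest choice for a sketch; a real scope check would likely also accept broader permission suffixes that include read and search.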

bulk_export_jobs: Arc<dyn BulkExportJobStore>, bulk_export_output: Arc<dyn ExportOutputStore>, bulk_export_file_auth: Arc<dyn ExportFileAuth>, plus an Arc<BulkExportConfig>. New with_bulk_export(...) builder and accessors so handlers can reach the subsystem behind feature toggles without touching the resource-storage S type parameter.

Full configuration surface: enabled, backend (embedded|postgres-s3), output_backend (local-fs|s3), output_dir, s3_bucket, requires_access_token (auto|true|false), file_url_ttl_secs, output_ttl_secs, worker_concurrency, disable_local_worker, max_concurrent_per_tenant, batch_size, lease_duration_secs, heartbeat_interval_secs, cleanup_interval_secs, since_newly_added (include|exclude). validate() rejects local-fs + requires_access_token=false (no pre-signed URL capability).
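
Only one validation rule is spelled out above (local-fs output cannot be combined with requires_access_token=false), so the sketch below models just that rule. ExportOutputSettings is a hypothetical two-field subset of the configuration, not the real BulkExportConfig, and the error message text is made up.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OutputBackend {
    LocalFs,
    S3,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RequiresAccessToken {
    Auto,
    True,
    False,
}

/// Hypothetical subset of the bulk-export configuration.
pub struct ExportOutputSettings {
    pub output_backend: OutputBackend,
    pub requires_access_token: RequiresAccessToken,
}

impl ExportOutputSettings {
    /// Reject local-fs output with `requires_access_token=false`: local files
    /// are served by HFS itself and there is no pre-signed URL to hand out.
    pub fn validate(&self) -> Result<(), String> {
        if self.output_backend == OutputBackend::LocalFs
            && self.requires_access_token == RequiresAccessToken::False
        {
            return Err(
                "local-fs output cannot be used with requires_access_token=false".to_string(),
            );
        }
        Ok(())
    }
}
```

Running a check like this at startup turns an unservable configuration into an immediate failure instead of a manifest whose download URLs cannot be fetched anonymously.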

Upgrade astral-tokio-tar from 0.6.1 to 0.6.2 in Cargo.lock to clear RUSTSEC-2026-0145, which is pulled in through testcontainers.

# Conflicts: # .github/workflows/bulk-export-smoke.yml

smunini

Looks better - see my comment here and the pending comment about validate

Restore the SQLite/PostgreSQL Elasticsearch composite rows as full lifecycle smoke coverage, while keeping MongoDB and S3 primary backends explicit as unsupported cases until their bulk-export provider behavior exists.

smunini · 2026-05-26T15:19:24Z

            {"backend":"sqlite","bulk_mode":"embedded-local","expectation":"full"},
            {"backend":"sqlite","bulk_mode":"postgres-s3","expectation":"full"},
            {"backend":"postgres","bulk_mode":"embedded-local","expectation":"full"},
            {"backend":"postgres","bulk_mode":"postgres-s3","expectation":"full"},


postgres-s3 is not a valid backend config

smunini · 2026-05-26T15:19:49Z

-            {"backend":"postgres-elasticsearch","bulk_mode":"embedded-local","expectation":"endpoint-unavailable"},
-            {"backend":"postgres-elasticsearch","bulk_mode":"postgres-s3","expectation":"endpoint-unavailable"},
+            {"backend":"sqlite-elasticsearch","bulk_mode":"embedded-local","expectation":"full"},
+            {"backend":"sqlite-elasticsearch","bulk_mode":"postgres-s3","expectation":"full"},


postgres-s3 is not a valid backend config

Use export_topology to distinguish the bulk export job/output topology from the persistence backend matrix, and keep unsupported runtime modes as endpoint-unavailable coverage. Configure AWS OIDC for postgres-s3 export rows, validate HFS_S3_EXPORT_BUCKET access, pass the real export bucket into HFS, and clean each job's isolated tenant export prefix after the smoke run.

…e/bulk-export

The bulk-export subsystem previously took a separate `HFS_BULK_EXPORT_BACKEND` (embedded|postgres-s3) plus its own `HFS_BULK_EXPORT_DATABASE_URL`, and constructed a fresh SqliteBackend or PostgresBackend internally — even when the FHIR side already had one configured. This change drops both env vars and has `build_bulk_export()` accept the job-state store as a parameter, so the FHIR backend's existing instance (and connection pool) is reused. Composite startup paths (sqlite-elasticsearch, postgres-elasticsearch) now also wire bulk export, passing the underlying relational primary as both data provider and job store while the composite continues to serve the rest of the API. CI smoke matrix collapses from a two-dimensional `{backend, export_topology}` into `{backend, output}` (output: local/s3/none); job state is no longer a matrix knob because it's derived from the FHIR backend. 15 rows × 3 FHIR versions. mongodb/s3 rows still depend on backend-side ExportDataProvider / BulkExportJobStore work that is not in this change.

…on-relational backends Implements the three bulk-export read-side traits for backends that previously stubbed them out:

- MongoDB: `ExportDataProvider`, `PatientExportProvider`, and `GroupExportProvider` over the resources collection. Keyset pagination on (last_updated, id) using a $or filter, matching the SQLite/Postgres cursor format. Patient compartment lookup uses the search_index collection (subject/patient → Patient/<id>) when the resource type isn't Patient itself.
- S3: completes the `PatientExportProvider` and `GroupExportProvider` stubs. Since S3 has no search index, the compartment scan iterates the resource_type prefix and inspects `subject.reference`/`patient.reference` on each resource's JSON. `get_group_members` reads `Group/<id>/current.json` and parses `member[].entity.reference`.

For job state, MongoDB and S3 lack transactional/atomic-claim semantics suitable for `BulkExportJobStore`, so the mongodb, mongodb-elasticsearch, and s3-elasticsearch startup paths now pair the primary backend with an embedded SQLite sidecar (`./data/bulk_export.db`) via a new `build_embedded_job_store` helper. The principle of "reuse the primary's config" still holds for SQLite/Postgres backends; for the others, the sidecar is HFS-owned local state and requires no extra configuration. Pure-S3 (no Elasticsearch) remains intentionally unwired: without a search index there's no scalable enumeration, matching the workflow matrix's `endpoint-unavailable` row.
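
Keyset pagination on (last_updated, id) can be sketched in memory as follows. The predicate mirrors the $or form mentioned above: a row qualifies if its timestamp is later than the cursor's, or equal with a greater id. Row, Cursor, and fetch_batch are hypothetical names, and the timestamp is modeled as an integer (for example epoch milliseconds) purely for simplicity.

```rust
/// Hypothetical resource row; `last_updated` is an integer timestamp here.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub last_updated: u64,
    pub id: String,
}

/// Position of the last row already returned.
pub type Cursor = (u64, String);

/// Return up to `batch_size` rows strictly after `cursor` in
/// (last_updated, id) order, plus the cursor for the following batch.
/// An empty page (and `None` cursor) signals that the export is drained.
pub fn fetch_batch(
    rows: &[Row],
    cursor: Option<&Cursor>,
    batch_size: usize,
) -> (Vec<Row>, Option<Cursor>) {
    let mut page: Vec<Row> = rows
        .iter()
        .filter(|r| match cursor {
            None => true,
            // Equivalent to: last_updated > ts OR (last_updated = ts AND id > cid)
            Some((ts, cid)) => {
                r.last_updated > *ts || (r.last_updated == *ts && r.id > *cid)
            }
        })
        .cloned()
        .collect();
    page.sort_by(|a, b| (a.last_updated, &a.id).cmp(&(b.last_updated, &b.id)));
    page.truncate(batch_size);
    let next = page.last().map(|r| (r.last_updated, r.id.clone()));
    (page, next)
}
```

Including id as a tiebreaker is what keeps rows with identical timestamps from being skipped or repeated across batches, and persisting the cursor is what lets a reclaimed job resume where the previous worker stopped.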

The previous implementation looked up compartment members via the search_index collection, but mongodb-elasticsearch deployments offload search and leave search_index empty — so patient/group exports of non-Patient resource types returned nothing. Switched to querying the resources collection directly using dot notation on data.subject.reference / data.patient.reference. Works identically whether or not search is offloaded.

When using the embedded SQLite sidecar job store (mongodb/s3 backends), default the DB path to `${HFS_BULK_EXPORT_OUTPUT_DIR}/bulk_export.db` instead of `${HFS_DATA_DIR}/bulk_export.db`. This prevents parallel HFS instances with different output directories (e.g. CI smoke jobs running with max-parallel: 2) from racing on the same SQLite file. Output dir is already unique per smoke job via `RESULTS_DIR/export-output`.

…ured Adds a third fallback to `build_embedded_job_store`: when neither `HFS_BULK_EXPORT_OUTPUT_DIR` nor `HFS_DATA_DIR` is set, use `${TMPDIR}/hfs-bulk-export-{pid}.db` so parallel HFS processes (e.g. CI smoke jobs on the same runner with `max-parallel: 2`) don't race on a single `./data/bulk_export.db` file. Affects mongodb / mongodb-elasticsearch / s3-elasticsearch backends with `output=s3` only — the other rows already get a unique output_dir.
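
The resulting three-step path resolution can be sketched as a pure function. embedded_job_store_path is a hypothetical name, the inputs are passed in rather than read from the environment so the logic is easy to test, and treating an empty string as unset is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Resolve the sidecar SQLite path:
/// 1. `<output_dir>/bulk_export.db` when the output dir is set,
/// 2. `<data_dir>/bulk_export.db` when only the data dir is set,
/// 3. `<tmp_dir>/hfs-bulk-export-<pid>.db` otherwise, so that parallel
///    processes on one machine never share a file.
/// Empty strings are treated as unset (assumption).
pub fn embedded_job_store_path(
    output_dir: Option<&str>,
    data_dir: Option<&str>,
    tmp_dir: &Path,
    pid: u32,
) -> PathBuf {
    if let Some(dir) = output_dir.filter(|d| !d.is_empty()) {
        return Path::new(dir).join("bulk_export.db");
    }
    if let Some(dir) = data_dir.filter(|d| !d.is_empty()) {
        return Path::new(dir).join("bulk_export.db");
    }
    tmp_dir.join(format!("hfs-bulk-export-{}.db", pid))
}
```

A caller would typically feed this from std::env::var("HFS_BULK_EXPORT_OUTPUT_DIR").ok(), std::env::var("HFS_DATA_DIR").ok(), std::env::temp_dir(), and std::process::id().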

The mongodb-elasticsearch startup path was the only composite that didn't call start_sync_workers() — leftover from when its row was endpoint-unavailable in CI and full smoke coverage didn't run. With default async sync mode, writes enqueue events that never get drained without a worker, so ES never sees seeded resources. Mirrors the sqlite-elasticsearch / postgres-elasticsearch / s3-elasticsearch startup paths.

…nt config Two fixes surfaced by full smoke coverage of the composite backends:

1. Patient/group compartment lookup for non-Patient resource types previously JOINed the search_index table. When search is offloaded to Elasticsearch (sqlite-elasticsearch / postgres-elasticsearch), that table is empty, so Observations were never returned. Now both SQLite (json_extract over the data BLOB) and Postgres (data #>> '{subject, reference}' over JSONB) read the resource payload directly, which is correct whether or not search is offloaded, matching the MongoDB implementation. Added a regression test that force-offloads search.
2. The primary S3 store never read an endpoint env var, so it always targeted real AWS even against MinIO. start_s3 / start_s3_elasticsearch now honor HFS_S3_ENDPOINT, HFS_S3_FORCE_PATH_STYLE, and HFS_S3_ALLOW_HTTP, mirroring the bulk-export output store. The smoke workflow sets these for the s3 / s3-elasticsearch backends so writes hit MinIO.

s3-elasticsearch points its primary store at MinIO via AWS_* env creds, which collide with the real-AWS creds the s3 output store needs (a single process shares one AWS SDK credential chain). The S3 output path is already validated by the sqlite/postgres rows, so s3-elasticsearch only needs output=local to exercise S3-as-primary bulk export.

…imary store [skip ci]

…n crate READMEs [skip ci] Move bulk-export env-var docs into helios-rest's README (matching the project convention of documenting config in crate READMEs) and surface $export from the root README's env table, Features list, and Core Components. Drop two stale rows (HFS_BULK_EXPORT_BACKEND, HFS_BULK_EXPORT_DATABASE_URL) from the docker compose guide — neither is read by the code, which reuses HFS_STORAGE_BACKEND/HFS_DATABASE_URL for job state.

aacruzgon added 30 commits May 15, 2026 11:44

feat(fhir): add version-agnostic get_compartment_params dispatch

c940b5a

Free function exposed at crate root that dispatches per FhirVersion to the existing helios_fhir::{r4,r4b,r5,r6}::get_compartment_params helpers. Lets persistence reuse the lookup without depending on helios-rest.
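The dispatch shape described here can be sketched as below. The real helpers live in `helios_fhir::{r4,r4b,r5,r6}` and their signatures are not visible on this page, so the `FhirVersion` stand-in, the `*_params` functions, the return type, and the tiny lookup table are all placeholders.

```rust
/// Stand-in for the crate's version enum; variants mirror the
/// r4 / r4b / r5 / r6 modules named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FhirVersion {
    R4,
    R4B,
    R5,
    R6,
}

/// Placeholder data shared by the per-version stand-ins below.
fn toy_lookup(compartment: &str, resource_type: &str) -> &'static [&'static str] {
    match (compartment, resource_type) {
        ("Patient", "Observation") => &["subject", "performer"],
        ("Patient", "Encounter") => &["patient"],
        _ => &[],
    }
}

// Hypothetical per-version helpers (the real ones are generated per version).
fn r4_params(c: &str, r: &str) -> &'static [&'static str] { toy_lookup(c, r) }
fn r4b_params(c: &str, r: &str) -> &'static [&'static str] { toy_lookup(c, r) }
fn r5_params(c: &str, r: &str) -> &'static [&'static str] { toy_lookup(c, r) }
fn r6_params(c: &str, r: &str) -> &'static [&'static str] { toy_lookup(c, r) }

/// Version-agnostic entry point: one exhaustive match, no REST dependency.
fn get_compartment_params(
    version: FhirVersion,
    compartment: &str,
    resource_type: &str,
) -> &'static [&'static str] {
    match version {
        FhirVersion::R4 => r4_params(compartment, resource_type),
        FhirVersion::R4B => r4b_params(compartment, resource_type),
        FhirVersion::R5 => r5_params(compartment, resource_type),
        FhirVersion::R6 => r6_params(compartment, resource_type),
    }
}
```

Because the match is exhaustive, adding a new `FhirVersion` variant fails compilation until the dispatch is extended, which is one reason to centralize it in a single free function.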

refactor(rest): use helios_fhir::get_compartment_params in compartmen…

75ded9a

…t handler Drops the private get_compartment_params_for_version wrapper in favor of the new shared dispatch on the helios-fhir crate.

feat(persistence): add BulkExportError::LeaseLost variant

d236e5c

Returned by fenced ExportWorkerStorage methods when a stale worker's mutation is rejected because the job has been reclaimed.
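One common way to implement this kind of fencing is a per-job lease epoch that is bumped on every (re)claim. The sketch below is an in-memory illustration of that idea, not the `ExportWorkerStorage` implementation: the `JobStore` type, its methods, the unit shape of `LeaseLost`, and the `JobNotFound` variant are all hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum BulkExportError {
    LeaseLost,   // shape of the real variant is not shown in the PR page
    JobNotFound, // hypothetical, added only to make the sketch total
}

struct Job {
    lease_epoch: u64,
    files_written: u32,
}

struct JobStore {
    jobs: HashMap<String, Job>,
}

impl JobStore {
    /// Claim or reclaim a job: bump the epoch and return it as the fence token.
    fn claim(&mut self, job_id: &str) -> Result<u64, BulkExportError> {
        let job = self.jobs.get_mut(job_id).ok_or(BulkExportError::JobNotFound)?;
        job.lease_epoch += 1;
        Ok(job.lease_epoch)
    }

    /// Fenced mutation: applied only if the caller holds the current epoch.
    fn record_file(&mut self, job_id: &str, epoch: u64) -> Result<(), BulkExportError> {
        let job = self.jobs.get_mut(job_id).ok_or(BulkExportError::JobNotFound)?;
        if job.lease_epoch != epoch {
            // The job was reclaimed since this worker took its lease.
            return Err(BulkExportError::LeaseLost);
        }
        job.files_written += 1;
        Ok(())
    }
}
```

If worker A claims the job, stalls, and worker B reclaims it, A's next `record_file` call carries a stale epoch and is rejected with `LeaseLost`, while B's writes go through. In a SQL-backed store the same check would usually be folded into the `WHERE` clause of the update.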

feat(persistence): re-export new bulk-export traits and types from core

b881b46

feat(s3): add S3Api::presign_get for pre-signed download URLs

86c76ae

Default impl reports unsupported; AwsS3Client overrides it via PresigningConfig from the AWS SDK. Used by S3OutputStore to mint direct-from-S3 download URLs for the bulk-export manifest.
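The "default reports unsupported, one client overrides" pattern can be sketched with a trait default method. The trait and method names follow the commit message, but the signature, the `S3Error` type, and both implementors below are assumptions; in particular `FakeSigningClient` does not produce a real signature, it only shows where an override plugs in.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum S3Error {
    Unsupported(&'static str), // hypothetical error shape
}

trait S3Api {
    /// Default: clients that cannot sign URLs say so explicitly.
    fn presign_get(
        &self,
        _bucket: &str,
        _key: &str,
        _expires_in: Duration,
    ) -> Result<String, S3Error> {
        Err(S3Error::Unsupported("presign_get"))
    }
}

/// A test double that keeps the default behaviour.
struct InMemoryS3;
impl S3Api for InMemoryS3 {}

/// Stand-in for a real client; builds an unsigned URL for illustration only.
struct FakeSigningClient {
    base_url: String,
}

impl S3Api for FakeSigningClient {
    fn presign_get(
        &self,
        bucket: &str,
        key: &str,
        expires_in: Duration,
    ) -> Result<String, S3Error> {
        Ok(format!(
            "{}/{}/{}?X-Amz-Expires={}",
            self.base_url,
            bucket,
            key,
            expires_in.as_secs()
        ))
    }
}
```

With this shape, existing mock implementations of the trait keep compiling unchanged, and an output store can surface the `Unsupported` error when pre-signed downloads are requested against a client that cannot provide them.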

chore(s3): #[allow(dead_code)] on legacy bulk-export keyspace helpers

e8bea5a

Reserved for future S3OutputStore integrations; unused now that S3 is output-only and keys live in S3OutputStore::object_key.

chore(s3): #[allow(dead_code)] on ExportJobState

d4f4617

S3 is no longer a bulk-export job-state backend; the model is preserved for a future read-modify-write integration.

chore(s3): #[allow(dead_code)] on internal delete_object helper

d6d667f

Reserved for future S3OutputStore integration; unused now that the synchronous BulkExportStorage path has been removed.

chore(s3): register output_store module + re-export public types

8f55fd9

test(s3): trim removed-impl tests; keep ExportDataProvider coverage

49856e0

bulk_export_start_manifest_and_delete is gone (the impl was removed); bulk_export_invalid_format_and_fetch_batch_cursor is reduced to the fetch_export_batch cursor case which still exercises ExportDataProvider.

test(s3): swap removed start_export/get_export_manifest for fetch_exp…

349bab7

…ort_batch S3 is no longer a bulk-export job-state backend; verify the ExportDataProvider data feed instead.

feat(mongodb): add bulk-export trait stubs

aaf48d0

ExportDataProvider / PatientExportProvider / GroupExportProvider impls returning UnsupportedCapability so MongoDB can satisfy the trait hierarchy without supporting bulk export as a primary.

chore(mongodb): register bulk_export stub module

eb3448d

chore(persistence): expose local_fs backend module

81d4b5e

aacruzgon and others added 6 commits May 19, 2026 11:51

docs(bulk-export): clarify compose stack is local example

b06997f

docs(bulk-export): clarify compose workflow usage

501aba7

fix(deps): update astral-tokio-tar for audit

c3a5447

Upgrade astral-tokio-tar from 0.6.1 to 0.6.2 in Cargo.lock to clear RUSTSEC-2026-0145, which is pulled in through testcontainers.

ci(bulk-export): limit smoke matrix to implemented backends

332bdd5

Merge remote-tracking branch 'origin/main' into feature/bulk-export

bf7d2e4

# Conflicts:
#	.github/workflows/bulk-export-smoke.yml

Merge remote-tracking branch 'origin/main' into feature/bulk-export

965c2da

smunini requested changes May 22, 2026

Comment thread .github/workflows/bulk-export-smoke.yml

ci(bulk-export): restore smoke backend coverage

1add1c6

Restore the SQLite/PostgreSQL Elasticsearch composite rows as full lifecycle smoke coverage, while keeping MongoDB and S3 primary backends explicit as unsupported cases until their bulk-export provider behavior exists.

smunini reviewed May 26, 2026

Comment thread .github/workflows/bulk-export-smoke.yml Outdated

smunini reviewed May 26, 2026

Comment thread .github/workflows/bulk-export-smoke.yml Outdated

smunini and others added 15 commits May 26, 2026 11:30

Removed $validate advertisement - we don't support that yet

bc5089e

Merge remote-tracking branch 'origin/feature/bulk-export' into featur…

6447274

…e/bulk-export

style(rest): apply rustfmt to capabilities handler

7fc75f7

docs(s3): document HFS_S3_ENDPOINT / force-path-style env vars for pr…

7e98bbb

…imary store [skip ci]

Merge branch 'main' into feature/bulk-export

c0b3eb6

smunini approved these changes May 28, 2026

smunini merged commit d54e202 into main May 28, 2026

smunini deleted the feature/bulk-export branch June 3, 2026 14:11

smunini merged 100 commits into main from feature/bulk-export

2 participants
