Add reproducible Planetary Computer benchmark fixtures and CI/manual benchmark reporting for load-path comparison by Copilot · Pull Request #445 · stac-utils/pgstac

Copilot · 2026-05-20T21:25:26Z

This PR adds reproducible benchmark fixtures for naip, sentinel-2-l2a, and landsat-c2-l2 (1000 items each), and wires benchmark execution/reporting so split-column+fragment behavior can be compared over time and against main. It introduces deterministic fixture materialization, machine-readable benchmark outputs, and dedicated CI/manual workflows for artifacted reporting.

Fixture source of truth (deterministic, reproducible)
- Added benchmarks/fixtures/planetary-computer/manifest.json defining:
  - API endpoint
  - target collections
  - exact per-collection item count (1000)
  - deterministic search settings
- Added fixture docs and layout under benchmarks/fixtures/planetary-computer/ with generated data intentionally excluded from git.
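To illustrate what "deterministic search settings" can mean in practice, here is a minimal sketch that turns a manifest into a STAC API search body with a fixed time window and a fixed sort order. The manifest keys (`api_url`, `item_count`, `search`, `page_size`), the datetime window, and the sort field are all assumptions made for illustration; the actual manifest.json in this PR may be laid out differently.

```python
import json

# Hypothetical manifest layout; the real manifest.json keys may differ.
MANIFEST = {
    "api_url": "https://planetarycomputer.microsoft.com/api/stac/v1",
    "item_count": 1000,
    "collections": ["naip", "sentinel-2-l2a", "landsat-c2-l2"],
    "search": {
        # Placeholder window: a closed interval keeps the result set stable.
        "datetime": "2022-01-01T00:00:00Z/2022-12-31T23:59:59Z",
        # A fixed sort order makes "the first N items" well defined.
        "sortby": [{"field": "id", "direction": "asc"}],
        "page_size": 250,
    },
}


def search_body(manifest, collection):
    """Build a STAC API search request body for one target collection."""
    if collection not in manifest["collections"]:
        raise ValueError(f"{collection!r} is not a target collection")
    search = manifest["search"]
    return {
        "collections": [collection],
        "datetime": search["datetime"],
        "sortby": search["sortby"],
        "limit": search["page_size"],
    }


body = search_body(MANIFEST, "naip")
print(json.dumps(body, indent=2, sort_keys=True))
```

Pinning both the time window and the sort order is what lets two runs, on different days, page through the same items in the same order.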

Fixture materialization + validation
- Added scripts/benchmark_fetch_pc_fixtures.py to:
  - fetch collection docs and items from Planetary Computer
  - enforce exact cardinality and collection membership
  - detect duplicate IDs
  - emit fixture SHA summaries for traceability.
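The checks listed above can be sketched roughly as follows. This is not the code in scripts/benchmark_fetch_pc_fixtures.py; the function names `validate_items` and `fixture_sha256` are hypothetical, and hashing canonical JSON sorted by item ID is one assumed way to get a stable fixture digest.

```python
import hashlib
import json


def validate_items(items, collection_id, expected_count):
    """Raise ValueError unless items form a clean fixture for collection_id."""
    # Exact cardinality: neither fewer nor more items than the manifest asks for.
    if len(items) != expected_count:
        raise ValueError(f"expected {expected_count} items, got {len(items)}")
    # Collection membership: every item must belong to the target collection.
    strays = [it.get("id") for it in items if it.get("collection") != collection_id]
    if strays:
        raise ValueError(f"items outside {collection_id}: {strays[:5]}")
    # Duplicate detection on item IDs.
    seen, dupes = set(), set()
    for it in items:
        if it["id"] in seen:
            dupes.add(it["id"])
        seen.add(it["id"])
    if dupes:
        raise ValueError(f"duplicate item ids: {sorted(dupes)[:5]}")


def fixture_sha256(items):
    """Hash items in ID order as canonical JSON so the digest ignores fetch order."""
    digest = hashlib.sha256()
    for it in sorted(items, key=lambda i: i["id"]):
        line = json.dumps(it, sort_keys=True, separators=(",", ":"))
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


sample = [{"id": f"item-{n}", "collection": "naip"} for n in range(3)]
validate_items(sample, "naip", expected_count=3)
print(fixture_sha256(sample))
```

Recording the digest next to the benchmark results makes it possible to tell later whether two runs really used the same fixture contents.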

Benchmark execution + machine-readable outputs
- Added scripts/benchmark_run.py to run per-collection:
  - ingest/dehydrate timing
  - hydrate timing
  - storage sizing (table/index/total + global schema/fragment bytes)
- Outputs JSON/CSV/Markdown artifacts suitable for CI retention and trend comparison.
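As a rough sketch of the "time it, then write JSON/CSV/Markdown" pattern: the helper names, column names, and file names below are assumptions for illustration and do not describe the actual output schema of scripts/benchmark_run.py. The timed workload is a stand-in; the real runner times database operations.

```python
import csv
import json
import tempfile
import time
from pathlib import Path


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def write_reports(rows, output_dir, label):
    """Write the same result rows as JSON, CSV, and a Markdown table."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    fields = list(rows[0].keys())
    (out / f"{label}.json").write_text(
        json.dumps({"label": label, "results": rows}, indent=2)
    )
    with open(out / f"{label}.csv", "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
    lines = [
        "| " + " | ".join(fields) + " |",
        "| " + " | ".join("---" for _ in fields) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[f]) for f in fields) + " |")
    (out / f"{label}.md").write_text("\n".join(lines) + "\n")
    return out


# Stand-in workload; a real run would time ingest and hydrate in Postgres.
elapsed = time_call(lambda: sum(range(100_000)))
rows = [{"collection": "naip", "ingest_seconds": round(elapsed, 6)}]
report_dir = write_reports(rows, tempfile.mkdtemp(), label="head")
print(sorted(p.name for p in report_dir.iterdir()))
```

Writing all three formats from one list of rows keeps the human-readable summary and the machine-readable artifacts from drifting apart.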

Base-vs-head comparison reporting
- Added scripts/benchmark_compare_results.py to diff two benchmark JSON reports and emit:
  - comparison JSON
  - tabular CSV
  - markdown summary with deltas and percentages.
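The delta and percentage computation can be sketched as below. The report shape (collection name mapped to a dict of numeric metrics) and the numbers in the usage example are made up for illustration; they are not results from this PR, and scripts/benchmark_compare_results.py may structure its input differently.

```python
def compare(base, head):
    """Diff two {collection: {metric: number}} reports into flat rows."""
    rows = []
    for collection in sorted(set(base) & set(head)):
        shared = set(base[collection]) & set(head[collection])
        for metric in sorted(shared):
            b = base[collection][metric]
            h = head[collection][metric]
            delta = h - b
            # Percent change is undefined when the base value is zero.
            pct = (delta / b * 100.0) if b else None
            rows.append(
                {"collection": collection, "metric": metric,
                 "base": b, "head": h, "delta": delta, "pct": pct}
            )
    return rows


def to_markdown(rows):
    """Render comparison rows as a Markdown table."""
    lines = [
        "| collection | metric | base | head | delta | pct |",
        "| --- | --- | --- | --- | --- | --- |",
    ]
    for r in rows:
        pct = "n/a" if r["pct"] is None else f"{r['pct']:+.1f}%"
        lines.append(
            f"| {r['collection']} | {r['metric']} | {r['base']} "
            f"| {r['head']} | {r['delta']:+g} | {pct} |"
        )
    return "\n".join(lines)


# Illustrative numbers only.
base_report = {"naip": {"ingest_seconds": 2.0, "total_bytes": 0}}
head_report = {"naip": {"ingest_seconds": 1.5, "total_bytes": 0}}
comparison = compare(base_report, head_report)
print(to_markdown(comparison))
```

Only metrics present in both reports are compared, so a metric added on one branch does not break the diff against an older report.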

CI/manual workflows for artifacted benchmarks
- Added .github/workflows/benchmark-fixtures.yml:
  - lightweight fixture-based benchmark run on PR/schedule/manual trigger
  - uploads raw + summarized artifacts
- Added .github/workflows/benchmark-compare.yml:
  - manual base-ref vs head comparison using same fixtures
  - publishes markdown summary and artifact bundle
- Added explicit minimal workflow permissions.

Docs/instructions updates
- Updated CLAUDE.md, AGENTS.md, and .github/instructions/scripts.instructions.md with the new benchmark entrypoints and workflow intent.
- Updated CHANGELOG.md to record benchmark fixture/reporting additions.

uv run --no-project --with 'psycopg[binary]' python scripts/benchmark_fetch_pc_fixtures.py \
  --manifest benchmarks/fixtures/planetary-computer/manifest.json \
  --output-dir /tmp/pgstac-benchmark-fixtures

uv run --no-project --with 'psycopg[binary]' python scripts/benchmark_run.py \
  --fixtures-dir /tmp/pgstac-benchmark-fixtures \
  --repo-root "$PWD" \
  --label head \
  --output-dir /tmp/pgstac-benchmark-results

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

apt.postgresql.org
- Triggering command: /usr/lib/apt/methods/http /usr/lib/apt/methods/http (dns block)
astral.sh
- Triggering command: /usr/bin/curl curl -LsSf REDACTED (dns block)
planetarycomputer.microsoft.com
- Triggering command: /usr/bin/python python - (dns block)
- Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

- Configure Actions setup steps to set up my environment, which run before the firewall is enabled
- Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Original prompt

Add reproducible benchmark fixtures and CI/manual benchmark reporting for the split-column + fragment work in PR #444.

Repository: stac-utils/pgstac
Branch to target: v010-pr1-load-path-core

Goals:

Add benchmark fixtures containing approximately 1000 items each, plus the collection documents, for these Microsoft Planetary Computer STAC collections:
- naip
- sentinel-2-l2a
- landsat-c2-l2
Use those fixtures to benchmark and report comparisons relevant to the old collection base-item approach on main versus the new instantiated-column + fragment approach on this branch.
Add CI/manual workflows and scripts so benchmark results are captured as artifacts and can be compared over time.

Important constraints and expectations:

The user explicitly wants fixtures of 1000 each for landsat, sentinel, and naip collections and items from the Planetary Computer STAC API.
If committing 3000 raw items directly is too large or awkward, the agent should choose the best practical approach while still satisfying the request as closely as possible. Preferred order:
1. Commit fixed benchmark fixtures into the repo in a dedicated benchmark fixtures location.
2. If that is too large, commit a deterministic fixture manifest and a fetch script that can materialize exactly 1000 items per collection from Planetary Computer, and wire CI/manual workflows to generate them reproducibly.
Benchmarking should cover at least:
- dehydrate/ingest timing
- hydrate timing
- disk/storage size reporting
Results should be machine-readable and suitable for CI artifacts, not just NOTICE output.
Add a lightweight CI benchmark workflow for stable fixture-based benchmarking and artifact upload.
Add a heavier or manual/scheduled benchmark workflow if needed for branch-vs-main comparison.
If feasible, add a branch comparison workflow or script that benchmarks current branch versus main using the same fixtures and emits a markdown summary and raw JSON/CSV results.
Reuse existing repo infrastructure where possible (.github/workflows/continuous-integration.yml, scripts/test, existing benchmark_dehydrate*.sql, docker-based CI).
Keep changes focused on benchmark fixtures/reporting; do not broaden scope unnecessarily.

Suggested deliverables:

- benchmark fixture location and files for collection docs and item datasets
- scripts to fetch and/or validate Planetary Computer fixtures
- SQL benchmark scripts for hydrate + storage plus normalization of dehydrate benchmark outputs
- shell/python orchestration for running benchmarks and writing reports
- GitHub Actions workflow(s) to run benchmarks and upload artifacts
- brief docs/README for how to run the benchmarks locally and in CI

Please create a PR implementing this in stac-utils/pgstac based on branch v010-pr1-load-path-core.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Do an intense review of the fragment approach we are using, with a particular eye towards making sure we are squeezing out every bit of performance we can. Find some sample Landsat, Sentinel, and NAIP collections and items from the Microsoft Planetary Computer STAC API so that we can benchmark the hydration and dehydration speeds for the old collection base item approach that is in main vs the instantiated column and fragment approach we are looking at now. Give me a report on both the speed differences and the size-on-disk differences for both approaches.

Assistant: I reviewed PR #444 and the current main-branch design it is replacing.

Short version

You’re asking for a real benchmark report, not just a code review, and

User: try again

Assistant: I dug into PR #444 again from the URL you sent.

What changed in PR #444

The old main approach stores a collection-level base_item generated from collection metadata, then dehydrates each item by stripping anything already covered by that base item. Historically that base_item contains at least type, stac_version, assets from item_assets, and collection. You can see that in the existing migration history and schema lineage. The collections table on the old path includes a stored base_item generated column. On hydrate, item JSON is rebuilt by merging the per-item stored JSON with that collection base JSON. This is the “collection base item” approach you referred to.
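As an illustration of the base-item round trip described above, here is a minimal Python sketch. It is not pgstac's implementation, which lives in SQL; the function names are hypothetical, and the sketch assumes every key in the base item is also present in the item, so it ignores the case where an item omits a key that the base item defines.

```python
import copy


def dehydrate(item, base):
    """Drop from item anything that base already supplies with the same value."""
    stored = {}
    for key, value in item.items():
        if key not in base:
            stored[key] = value
        elif isinstance(value, dict) and isinstance(base[key], dict):
            # Recurse so that only the differing leaves are stored.
            sub = dehydrate(value, base[key])
            if sub:
                stored[key] = sub
        elif value != base[key]:
            stored[key] = value
    return stored


def hydrate(stored, base):
    """Rebuild the full item by layering stored values over the base item."""
    item = copy.deepcopy(base)
    for key, value in stored.items():
        if isinstance(value, dict) and isinstance(item.get(key), dict):
            item[key] = hydrate(value, item[key])
        else:
            item[key] = copy.deepcopy(value)
    return item


base_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "collection": "naip",
    "assets": {"image": {"type": "image/tiff", "roles": ["data"]}},
}
full_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "collection": "naip",
    "id": "example-item",
    "assets": {
        "image": {
            "type": "image/tiff",
            "roles": ["data"],
            "href": "https://example.com/example-item.tif",
        }
    },
}
stored = dehydrate(full_item, base_item)
print(stored)
```

In this example only the `id` and the asset `href` survive dehydration; everything else is recovered from the base item when the item is hydrated again.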

PR #444 removes that model and replaces it with:

- split top-level item storage into dedicated columns like bbox, links, assets, properties, and extra
- many promoted scalar/queryable columns such as platform, mission, eo_cloud_cover, proj_epsg, etc.
- a new item_fragments table for deduplicated shared JSON chunks
- collection-level fragment_config instead of base_item
- hydrate logic that reassembles an item from split columns plus an optional fragment row, using jsonb_merge_level1 for assets/properties reassembly
- staging-trigger fragment extraction and dedup dur...

This pull request was created from Copilot chat.

Agent-Logs-Url: https://github.com/stac-utils/pgstac/sessions/08cf3ee2-6a60-42ab-b23b-fac084012a4c Co-authored-by: bitner <164828+bitner@users.noreply.github.com>

…benchmark reporting for load-path comparison (#445)
* Initial plan
* Add benchmark fixture manifests, runners, and CI workflows
  Agent-Logs-Url: https://github.com/stac-utils/pgstac/sessions/08cf3ee2-6a60-42ab-b23b-fac084012a4c
  Co-authored-by: bitner <164828+bitner@users.noreply.github.com>
* Address benchmark review feedback and tighten workflow permissions
  Agent-Logs-Url: https://github.com/stac-utils/pgstac/sessions/08cf3ee2-6a60-42ab-b23b-fac084012a4c
  Co-authored-by: bitner <164828+bitner@users.noreply.github.com>
* Polish benchmark scripts after validation feedback
  Agent-Logs-Url: https://github.com/stac-utils/pgstac/sessions/08cf3ee2-6a60-42ab-b23b-fac084012a4c
  Co-authored-by: bitner <164828+bitner@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bitner <164828+bitner@users.noreply.github.com>

Initial plan

70747ab

Copilot AI assigned Copilot and bitner May 20, 2026

Copilot started work on behalf of bitner May 20, 2026 21:25 View session

Copilot AI and others added 3 commits May 20, 2026 21:33

Add benchmark fixture manifests, runners, and CI workflows

7803ffa

Agent-Logs-Url: https://github.com/stac-utils/pgstac/sessions/08cf3ee2-6a60-42ab-b23b-fac084012a4c Co-authored-by: bitner <164828+bitner@users.noreply.github.com>

Address benchmark review feedback and tighten workflow permissions

016ac26

Agent-Logs-Url: https://github.com/stac-utils/pgstac/sessions/08cf3ee2-6a60-42ab-b23b-fac084012a4c Co-authored-by: bitner <164828+bitner@users.noreply.github.com>

Polish benchmark scripts after validation feedback

eeadfbf

Agent-Logs-Url: https://github.com/stac-utils/pgstac/sessions/08cf3ee2-6a60-42ab-b23b-fac084012a4c Co-authored-by: bitner <164828+bitner@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add reproducible benchmark fixtures and CI benchmark reporting~~ Add reproducible Planetary Computer benchmark fixtures and CI/manual benchmark reporting for load-path comparison May 20, 2026

Copilot finished work on behalf of bitner May 20, 2026 21:39

Copilot AI requested a review from bitner May 20, 2026 21:39

bitner marked this pull request as ready for review May 21, 2026 14:27

bitner merged commit 9251e60 into v010-pr1-load-path-core May 21, 2026

bitner merged 4 commits into v010-pr1-load-path-core from copilot/add-benchmark-fixtures-ci-reporting
