Document direct querying, helper functions, reproducibility #5

@bbest

Context

The docs Quarto book has a comprehensive maps.qmd and a good db.qmd, but api.qmd is a near-empty stub and there's no page on querying the GCS parquet data directly or with the new calcofi4r helpers.

Goal

Document how to query CalCOFI data three ways — direct SQL, calcofi4r helpers, and the int-app download — with a reproducibility story tying them together.
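For the direct SQL route, a minimal sketch of the DuckDB statements the new page might walk through. The bucket, prefix, and table names below are hypothetical placeholders, since the real GCS layout is not stated in this issue. The sketch only builds the SQL strings, and it assumes a gs:// URI so that the glob in the hive-partitioned case can list objects.

```python
# Hypothetical bucket and prefix: the real CalCOFI GCS location is not given in this issue.
BASE_URI = "gs://example-bucket/example-release"


def setup_sql() -> str:
    """DuckDB statements that install and load the httpfs extension for remote reads."""
    return "INSTALL httpfs;\nLOAD httpfs;"


def single_file_sql(table: str, limit: int = 10) -> str:
    """Query one parquet file, assumed to live at <BASE_URI>/<table>.parquet."""
    return f"SELECT * FROM read_parquet('{BASE_URI}/{table}.parquet') LIMIT {limit};"


def hive_partitioned_sql(table: str, limit: int = 10) -> str:
    """Query a hive-partitioned dataset, assumed layout <table>/<key>=<value>/*.parquet."""
    return (
        f"SELECT * FROM read_parquet('{BASE_URI}/{table}/*/*.parquet', "
        f"hive_partitioning = true) LIMIT {limit};"
    )


if __name__ == "__main__":
    # Table names are placeholders, not actual CalCOFI tables.
    print(setup_sql())
    print(single_file_sql("example_table"))
    print(hive_partitioned_sql("example_partitioned_table"))
```

The printed statements can be pasted into a DuckDB session once the placeholders are replaced. Any credential setup needed for gs:// access is not covered here.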

Tasks

  • New data-access.qmd: direct DuckDB + GCS parquet querying, covering httpfs setup, single-file vs hive-partitioned read_parquet examples, and a ## Reproducibility section explaining the query/ folder in the int-app download
  • New helpers.qmd — the calcofi4r bio↔env matching helpers with worked examples
  • Expand api.qmd — add a "superseded by calcofi4r helpers + direct querying" callout
  • _quarto.yml — insert data-access.qmd and helpers.qmd after db.qmd, before api.qmd
  • Recurring worked example across all three pages: "Pacific sardine larvae + temperature, Q1 2023, relaxed matching" shown three ways — direct SQL, cc_match_ichthyo_by_name(), int-app download — all producing identical rows
  • Verify: quarto render docs/; confirm the worked-example SQL matches attr(cc_match_ichthyo_by_name(...), "sql")
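The SQL comparison in the Verify step could be made tolerant of formatting differences. The sketch below is one possible approach, not part of calcofi4r: it collapses whitespace and drops a trailing semicolon before comparing. Both query strings are placeholders. In practice one would come from the docs page and the other from attr(cc_match_ichthyo_by_name(...), "sql") on the R side.

```python
import re


def normalize_sql(sql: str) -> str:
    """Collapse runs of whitespace and drop a trailing semicolon.

    Case is preserved, so differences in string literals still count.
    Caveat: whitespace inside string literals is collapsed too.
    """
    collapsed = re.sub(r"\s+", " ", sql).strip()
    return collapsed.rstrip(";").strip()


def same_sql(a: str, b: str) -> bool:
    """True when two SQL strings differ only in whitespace or a trailing semicolon."""
    return normalize_sql(a) == normalize_sql(b)


if __name__ == "__main__":
    # Placeholder queries standing in for the docs SQL and the helper's SQL.
    docs_sql = "SELECT a, b\n  FROM t\n WHERE a = 1;"
    helper_sql = "SELECT a, b FROM t WHERE a = 1"
    print(same_sql(docs_sql, helper_sql))
```

A plain string comparison like this cannot tell whether two differently written queries are equivalent, so a mismatch means the two should be compared by hand rather than that the docs are wrong.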

Blocked by: CalCOFI/workflows#51, CalCOFI/apps#40, CalCOFI/calcofi4r#10, CalCOFI/int-app#5. Final issue of a 5-issue epic.
