feat: Add query-index-optimizer skill for EDS content ops #189

Open
focusgts wants to merge 2 commits into adobe:main from focusgts:feat/eds-query-index-optimizer

@focusgts

Contributor

Summary

Adds the query-index-optimizer skill to the EDS content ops plugin.

Audits and tunes the query index — analyzes indexed properties against actual usage, checks index size and pagination, and generates helix-query.yaml recommendations.
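The "indexed properties against actual usage" comparison could be sketched roughly as follows. This is an illustration only, not the skill's actual implementation: `auditProperties`, `IndexEntry`, and `PropertyAudit` are hypothetical names, and the list of properties read by consumers is assumed to have been collected separately (for example by inspecting block code).

```typescript
// Hypothetical types: one row of the index, and the audit result.
type IndexEntry = Record<string, unknown>;

interface PropertyAudit {
  unused: string[]; // indexed, but no consumer reads it
  missing: string[]; // read by a consumer, but absent from the index
}

// Compare the property names present in index rows with the property
// names that consumers are known to read (assumed input).
function auditProperties(
  entries: IndexEntry[],
  usedByConsumers: string[],
): PropertyAudit {
  const indexed = new Set<string>();
  for (const entry of entries) {
    for (const key of Object.keys(entry)) {
      indexed.add(key);
    }
  }
  const used = new Set<string>(usedByConsumers);
  return {
    unused: Array.from(indexed).filter((k) => !used.has(k)).sort(),
    missing: Array.from(used).filter((k) => !indexed.has(k)).sort(),
  };
}
```

Properties reported as `unused` are candidates for removal from the index definition; properties reported as `missing` point at consumers that expect a column the index does not provide.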

Follows the established format (functional description, External Content Safety, concrete code examples, reference file for progressive disclosure, Apache-2.0). Submitted as a standalone PR per @trieloff's request to keep one skill per PR.

Test plan

  • tessl-review passes (≥80

@trieloff

Contributor

@dominique-pfister this seems to be your area of expertise, can you take a look, please?


### Key Concepts

- **helix-query.yaml** — Lives in the GitHub repo root. Defines which properties to index and how they are sourced (from metadata, headings, or content).

This is no longer true: it now lives in the configuration. See https://www.aem.live/developer/indexing#setting-up-an-initial-index-with-the-index-admin-tool

### Key Concepts

- **helix-query.yaml** — Lives in the GitHub repo root. Defines which properties to index and how they are sourced (from metadata, headings, or content).
- **query-index.json** — The live JSON endpoint. Returns an array of page entries with the indexed properties.

This is just the default name of the query index; it can also be named differently.

- **helix-query.yaml** — Lives in the GitHub repo root. Defines which properties to index and how they are sourced (from metadata, headings, or content).
- **query-index.json** — The live JSON endpoint. Returns an array of page entries with the indexed properties.
- **Consumers** — Blocks and components that fetch `query-index.json` to build dynamic lists: navigation, footer, card lists, search results, recent posts, tag-filtered collections.
- **Default limit** — The index returns a maximum of 500 entries by default. Sites with more pages need to paginate or increase the limit.

Wrong, it returns 1000 entries by default, as documented here: https://www.aem.live/developer/spreadsheets#offset-and-limit
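For sites with more entries than one response returns, a consumer has to page through the index using the `offset` and `limit` query parameters described in the linked doc. A minimal sketch, assuming the response carries `total`, `offset`, `limit`, and `data` fields (verify against your own index); `fetchAllEntries`, `pageUrl`, and `PageFetcher` are hypothetical names, and the fetcher is injected so the loop can run without a network:

```typescript
// Assumed shape of one page of the index response.
interface IndexPage<T> {
  total: number;
  offset: number;
  limit: number;
  data: T[];
}

// Hypothetical fetcher type; in a block this would typically wrap
// fetch(url) followed by response.json().
type PageFetcher<T> = (url: string) => Promise<IndexPage<T>>;

// Build the URL for one page. The index path is a parameter because
// query-index.json is only the default name.
function pageUrl(indexPath: string, offset: number, limit: number): string {
  return `${indexPath}?offset=${offset}&limit=${limit}`;
}

// Request pages until every entry reported by `total` has been collected.
async function fetchAllEntries<T>(
  fetchPage: PageFetcher<T>,
  indexPath: string = "/query-index.json",
  pageSize: number = 1000,
): Promise<T[]> {
  const entries: T[] = [];
  let offset = 0;
  for (;;) {
    const page = await fetchPage(pageUrl(indexPath, offset, pageSize));
    for (const row of page.data) {
      entries.push(row);
    }
    offset += page.data.length;
    // Stop on an empty page as well, so a wrong `total` cannot loop forever.
    if (page.data.length === 0 || offset >= page.total) {
      break;
    }
  }
  return entries;
}
```

The page size of 1000 here mirrors the documented default; a smaller value trades more requests for smaller responses.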

- **query-index.json** — The live JSON endpoint. Returns an array of page entries with the indexed properties.
- **Consumers** — Blocks and components that fetch `query-index.json` to build dynamic lists: navigation, footer, card lists, search results, recent posts, tag-filtered collections.
- **Default limit** — The index returns a maximum of 500 entries by default. Sites with more pages need to paginate or increase the limit.
- **Index freshness** — The index updates when pages are previewed or published via Sidekick. Unpublished pages remain in the index until explicitly removed.

@dominique-pfister dominique-pfister Jun 29, 2026

Wrong, it should be:

The index updates when pages are published (whether via Sidekick or programmatically does not matter).


## How the EDS Query Index Works

The query index is the primary mechanism for blocks and components to discover and list content in an EDS site. It is configured via a `helix-query.yaml` file in the GitHub repository and served as JSON at `/query-index.json`.

@dominique-pfister dominique-pfister left a comment

I already found 4 errors in the first few lines, and they are repeated later. Please fix them and align with the current documentation.

…docs

Address @dominique-pfister review:
- Index is configured via the Index Admin tool / Admin API, not a
  helix-query.yaml file in the GitHub repo
- query-index.json is the default index name; indices can be named differently
- Default entry limit is 1000, not 500 (per the spreadsheets doc)
- Pages are indexed on publish, not preview
- Cite the indexing and spreadsheets docs as sources of truth

Co-Authored-By: claude-flow <ruv@ruv.net>
@focusgts

Contributor Author

Thanks @dominique-pfister — these are exactly right, and I appreciate the careful review with the doc links. I've corrected all four against the current docs: index configured via the Index Admin tool (not a repo helix-query.yaml), query-index.json as the default name, the 1000-entry default limit per the spreadsheets doc, and indexing on publish. I also cited the indexing and spreadsheets docs as sources of truth in the reference file. The fix is pushed.

You're right that these repeat — we're doing a documentation-grounded pass over the rest of our skills to catch the same class of staleness. Appreciate you keeping the bar high.
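A first pass of this kind can be approximated with a simple text scan. The sketch below is illustrative only: `findStaleClaims` and the rule list are hypothetical, the patterns cover just the corrections raised in this review, and every hit still needs a human check against the current docs, since a pattern also matches text that mentions the old behavior correctly.

```typescript
// Hypothetical rule: a pattern for an outdated claim and the correction
// taken from this review.
interface StaleRule {
  pattern: RegExp;
  correction: string;
}

interface StaleHit {
  line: number; // 1-based line number in the scanned text
  correction: string;
}

const STALE_RULES: StaleRule[] = [
  {
    pattern: /helix-query\.yaml/i,
    correction: "index is configured via the Index Admin tool / Admin API",
  },
  {
    pattern: /\b500 entries\b/i,
    correction: "default entry limit is 1000",
  },
  {
    pattern: /previewed or published/i,
    correction: "pages are indexed on publish",
  },
];

// Return one hit per (line, rule) match so a reviewer can inspect each.
function findStaleClaims(text: string): StaleHit[] {
  const hits: StaleHit[] = [];
  text.split("\n").forEach((content, i) => {
    for (const rule of STALE_RULES) {
      if (rule.pattern.test(content)) {
        hits.push({ line: i + 1, correction: rule.correction });
      }
    }
  });
  return hits;
}
```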

@focusgts

Contributor Author

@dominique-pfister @trieloff — your review here prompted us to audit our entire content-ops suite against the current aem.live docs, not just this skill. We found the same class of staleness (the helix-query.yaml → Index Admin change, the 1000-entry default, publish-vs-preview indexing, and a few others such as the current helix-sitemap.yaml schema and the RUM Bundler API shape) in several already-merged skills, and we have doc-grounded corrections ready, each cited to an aem.live page.

Would you prefer one PR per skill (per the usual process here) or a single batched corrections PR? Happy to do whichever is easiest to review. Thanks again — the review made the whole set better.

3 participants