Add SearchIndex entity and Search Management API #1391

Draft
BryanFauble wants to merge 1 commit into develop from search-index-entity

Conversation

@BryanFauble
Member

Summary

Adds support for the new SearchIndex Synapse entity (an OpenSearch-backed full-text / faceted / autocomplete index over a SQL-defined view of any table-like Synapse entity), the SearchManagementController REST surface that powers it, and an internal helper module for stitching its analyzer/synonym configuration together.

Production changes

New api/search_services.py — 20 thin async functions over the SearchManagementController:

  • create_/get_/update_/list_text_analyzer{,s}
  • create_/get_/update_/list_column_analyzer_override{,s}
  • create_/get_/update_/list_synonym_set{,s}
  • create_/get_/update_/list_search_configuration{,s}
  • bind_search_config_to_entity, get_search_config_binding, clear_search_config_binding
  • autocomplete_search
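As a rough illustration of what a "thin async function" with None-stripping might look like, here is a minimal sketch. The `FakeClient` class, the `/search/analyzer/list` URI, and the body field names are assumptions made for this example; the real paths and shapes are defined in api/search_services.py.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple


class FakeClient:
    """Hypothetical stand-in for the Synapse client; records each request."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str, Dict[str, Any]]] = []

    async def rest_post_async(self, uri: str, body: Dict[str, Any]) -> Dict[str, Any]:
        self.calls.append(("POST", uri, body))
        return {"page": []}


async def list_text_analyzers(
    client: FakeClient,
    organization_name: Optional[str] = None,
    next_page_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Request one page of analyzers, omitting unset (None) fields from the body."""
    body = {
        "organizationName": organization_name,  # assumed field name
        "nextPageToken": next_page_token,  # assumed field name
    }
    body = {key: value for key, value in body.items() if value is not None}
    # Assumed URI, used only for illustration.
    return await client.rest_post_async("/search/analyzer/list", body)


client = FakeClient()
result = asyncio.run(
    list_text_analyzers(client, organization_name="org.sagebionetworks")
)
```

Because `next_page_token` is left unset, it never reaches the request body, which is the behavior the unit tests for the list endpoints are described as checking.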

New models/search_index.py: SearchIndex entity dataclass following the existing VirtualTable/MaterializedView pattern: dataclass + @async_to_sync, inline SearchIndexSynchronousProtocol, mixin composition (AccessControllable, TableBase, TableStoreMixin, DeleteMixin, GetMixin, QueryMixin). Columns are read-only and derived from defining_sql. store_async raises ValueError if defining_sql is missing.
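The store_async validation described above can be sketched roughly as follows. `SearchIndexSketch` and `try_store` are hypothetical names for this example only; the real class composes the listed mixins and delegates the actual store to them.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchIndexSketch:
    """Illustrative stand-in for the entity dataclass, not the real class."""

    name: Optional[str] = None
    parent_id: Optional[str] = None
    defining_sql: Optional[str] = None

    async def store_async(self) -> "SearchIndexSketch":
        # Columns are derived from defining_sql, so there is nothing
        # meaningful to store without it.
        if not self.defining_sql:
            raise ValueError("defining_sql is required to store a SearchIndex.")
        # The real method would delegate to the table-store mixin here.
        return self


def try_store(index: SearchIndexSketch) -> str:
    """Run store_async and report whether validation passed."""
    try:
        asyncio.run(index.store_async())
        return "stored"
    except ValueError as error:
        return f"rejected: {error}"


missing = try_store(SearchIndexSketch(name="idx"))
present = try_store(
    SearchIndexSketch(name="idx", defining_sql="SELECT * FROM syn123")
)
```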

New models/search_management.py — 14 dataclasses + 5 enums covering the request/response shapes:

  • TextAnalyzer, ColumnAnalyzerOverride{,Entry}, SynonymSet, SearchConfiguration, SearchConfigBinding
  • SearchIndexStatus, SearchIndexState
  • SearchQuery, SearchQueryType, SearchQueryPart, KeyValues, KeyRange, FacetRequest, FacetSortField, SortField, SortDirection, SearchHit, SearchFieldValue
  • SearchIndexQuery riding the AsynchronousCommunicator mixin for /search/query/async

New models/services/search_setup.py — internal-only helpers for the synonym-aware analyzer chain (SynonymSet → cloned TextAnalyzer with $ref → SearchConfiguration with column overrides). Includes clone_settings_with_search_synonyms for splicing the synonym filter into a system analyzer's default_search chain so synonyms are expanded at search time only.
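To make the search-time-only splice concrete, here is a minimal sketch under stated assumptions: the settings layout follows the usual OpenSearch `analysis` structure, and the function name, filter name, and inline synonym list are hypothetical. The PR's clone_settings_with_search_synonyms references a SynonymSet via $ref, whose exact shape is not shown here.

```python
import copy
from typing import Any, Dict


def clone_with_search_synonyms(
    settings: Dict[str, Any],
    filter_name: str,
    filter_definition: Dict[str, Any],
) -> Dict[str, Any]:
    """Return a copy of settings with a synonym filter appended to the
    default_search analyzer only, leaving the index-time analyzer alone."""
    cloned = copy.deepcopy(settings)  # never mutate the system analyzer
    analysis = cloned.setdefault("analysis", {})
    analyzers = analysis.get("analyzer", {})
    if "default_search" not in analyzers:
        raise ValueError("settings have no default_search analyzer to extend")
    # Register the filter definition, then reference it from the search chain.
    analysis.setdefault("filter", {})[filter_name] = filter_definition
    chain = analyzers["default_search"].setdefault("filter", [])
    if filter_name not in chain:
        chain.append(filter_name)
    return cloned


system_settings = {
    "analysis": {
        "analyzer": {
            "default": {"tokenizer": "standard", "filter": ["lowercase"]},
            "default_search": {"tokenizer": "standard", "filter": ["lowercase"]},
        }
    }
}
cloned = clone_with_search_synonyms(
    system_settings,
    "search_synonyms",  # hypothetical filter name
    {"type": "synonym_graph", "synonyms": ["tumor, tumour"]},
)
```

Appending to `default_search` but not `default` means indexed tokens stay unchanged, so a synonym set can be edited without reindexing the underlying data.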

3-way registration per the existing concrete-type convention:

  • core/constants/concrete_types.py — adds SEARCH_INDEX_ENTITY and SEARCH_INDEX_QUERY
  • api/entity_factory.py — adds SEARCH_INDEX_ENTITY → SearchIndex in the type dispatch map
  • models/mixins/asynchronous_job.py — adds SEARCH_INDEX_QUERY → /search/query/async in ASYNC_JOB_URIS
  • models/mixins/table_components.py — adds SearchIndex to CLASSES_WITH_READ_ONLY_SCHEMA
  • models/__init__.py — public exports for the new symbols
  • api/__init__.py — re-exports the new service functions
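The registration pattern above amounts to adding entries to a few lookup tables. The sketch below shows the idea with placeholder values: the concrete-type strings, `ENTITY_TYPE_MAP`, and `build_entity` are hypothetical, while the `/search/query/async` URI and the `ASYNC_JOB_URIS` name come from the description.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type

# Placeholder values; the real strings live in core/constants/concrete_types.py.
SEARCH_INDEX_ENTITY = "example.SearchIndex"
SEARCH_INDEX_QUERY = "example.SearchIndexQuery"


@dataclass
class SearchIndex:
    """Minimal stand-in for the real entity model."""

    id: str


# Type dispatch: concrete type string -> model class (hypothetical map name).
ENTITY_TYPE_MAP: Dict[str, Type[SearchIndex]] = {SEARCH_INDEX_ENTITY: SearchIndex}

# Async job routing: concrete type string -> job endpoint.
ASYNC_JOB_URIS: Dict[str, str] = {SEARCH_INDEX_QUERY: "/search/query/async"}


def build_entity(payload: Dict[str, Any]) -> SearchIndex:
    """Pick the model class from the payload's concreteType field."""
    concrete_type = payload.get("concreteType")
    entity_class = ENTITY_TYPE_MAP.get(concrete_type)
    if entity_class is None:
        raise ValueError(f"Unsupported concrete type: {concrete_type!r}")
    return entity_class(id=payload["id"])


entity = build_entity({"concreteType": SEARCH_INDEX_ENTITY, "id": "syn123"})
job_uri = ASYNC_JOB_URIS[SEARCH_INDEX_QUERY]
```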

Test plan

  • Unit tests (98 new, all passing)

    • tests/unit/synapseclient/api/unit_test_search_services.py — 21 tests covering URI/body/verb for every service function and None-stripping for list endpoints
    • tests/unit/synapseclient/models/async/unit_test_search_index_async.py — 11 tests covering field defaults, fill_from_dict round-trip, to_synapse_request shape, store_async validation, super-method delegation
    • tests/unit/synapseclient/models/async/unit_test_search_management_async.py — 41 tests covering every dataclass round-trip, enum coercion, $ref normalization, and the SearchIndexQuery → AsynchronousCommunicator wiring
    • tests/unit/synapseclient/models/async/unit_test_search_setup_async.py — 25 tests covering rule rendering, settings cloning (with all error-path branches), column override derivation, paginated find-by-name, idempotent ensure_* helpers
  • Integration tests (tests/integration/synapseclient/models/async/test_search_index_async.py)

    • TestSearchIndexEntity — full create/get/update/delete lifecycle against a live server, with feature-detection skip if the server doesn't yet expose the entity type
    • TestSearchManagementReadEndpoints: list_text_analyzers + get_text_analyzer against the bootstrapped org.sagebionetworks system analyzers (any authenticated caller can hit these)
    • TestSearchManagementWriteEndpoints — gated behind SYNAPSE_SAGE_EMPLOYEE_TOKEN; placeholder for when Sage-employee CI auth is wired up
  • pre-commit run --all-files — clean (ruff, isort, black, bandit)

  • pytest -sv tests/unit/synapseclient/api tests/unit/synapseclient/models — 1167 passing, 0 regressions
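The feature-detection skip mentioned for TestSearchIndexEntity could work along these lines. This is a sketch, not the PR's code: the helper name is hypothetical, and the token list contains only the three strings visible in the truncated review excerpt, which appears to belong to this check and may include more entries.

```python
# Substrings that suggest the server does not know the entity type yet.
# Only the tokens visible in the review excerpt are listed here.
UNSUPPORTED_TOKENS = (
    "unsupported entity type",
    "concretetype",
    "not allowed",
)


def server_lacks_search_index(error: Exception) -> bool:
    """Return True when an error message looks like a missing-feature response."""
    message = str(error).lower()
    return any(token in message for token in UNSUPPORTED_TOKENS)


unsupported = server_lacks_search_index(ValueError("Unsupported entity type: X"))
unrelated = server_lacks_search_index(ConnectionError("Connection reset by peer"))
```

A test fixture could call such a helper in an `except` branch and skip the test instead of failing when it returns True.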

Notes

  • Restored project_setting_services imports in api/__init__.py that had been accidentally deleted in the original feature branch.
  • SearchManagementController write endpoints (create/update of TextAnalyzer / ColumnAnalyzerOverride / SynonymSet / SearchConfiguration) are restricted to Sage Bionetworks employees server-side. Integration coverage for those endpoints is gated behind SYNAPSE_SAGE_EMPLOYEE_TOKEN; the unit tests cover the wire format unconditionally.
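One way to express the environment-variable gate is sketched below using the standard library's unittest so the example is self-contained; the project itself runs pytest, where a skipif marker would play the same role. The helper and class names are hypothetical.

```python
import os
import unittest
from typing import Mapping

TOKEN_ENV_VAR = "SYNAPSE_SAGE_EMPLOYEE_TOKEN"


def has_employee_token(environ: Mapping[str, str] = os.environ) -> bool:
    """True when the gating token is present and non-empty."""
    return bool(environ.get(TOKEN_ENV_VAR, "").strip())


@unittest.skipUnless(has_employee_token(), f"{TOKEN_ENV_VAR} is not set")
class TestSearchManagementWriteEndpointsSketch(unittest.TestCase):
    """Placeholder suite; it only runs when the token is configured."""

    def test_token_is_available(self) -> None:
        self.assertTrue(has_employee_token())
```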

- New SearchIndex entity model wired through entity_factory and the
  async-job URI map; columns are derived read-only from defining_sql.
- Search Management API surface in api/search_services.py:
  TextAnalyzer, ColumnAnalyzerOverride, SynonymSet, SearchConfiguration,
  SearchConfigBinding, autocomplete_search.
- SearchIndexQuery dataclass riding the AsynchronousCommunicator mixin
  for /search/query/async, plus SearchQuery and supporting filter/facet
  dataclasses.
- Internal models/services/search_setup helpers for upserting the
  synonym-aware analyzer chain (SynonymSet -> TextAnalyzer clone ->
  SearchConfiguration).
- Unit coverage for every public function and dataclass round-trip.
- Integration coverage for the SearchIndex entity lifecycle and the
  read-only Search Management endpoints; write endpoints gated behind
  SYNAPSE_SAGE_EMPLOYEE_TOKEN.
@BryanFauble BryanFauble requested a review from a team as a code owner May 28, 2026 21:44
Copilot AI review requested due to automatic review settings May 28, 2026 21:44
@BryanFauble BryanFauble marked this pull request as draft May 28, 2026 21:44
Contributor

Copilot AI left a comment

Pull request overview

Adds first-class client support for Synapse’s new OpenSearch-backed SearchIndex entity, along with a thin async Search Management API surface, corresponding request/response dataclasses, and helper utilities for synonym-aware analyzer/configuration setup.

Changes:

  • Introduces SearchIndex entity model plus async job model SearchIndexQuery for /search/query/async.
  • Adds api/search_services.py (SearchManagementController endpoints) and internal models/services/search_setup.py helpers for synonym/analyzer/config stitching.
  • Registers new concrete types and wires them into entity factory + async job routing, with unit/integration test coverage.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| synapseclient/api/__init__.py | Re-exports new Search Management service functions from the API surface. |
| synapseclient/api/entity_factory.py | Registers SEARCH_INDEX_ENTITY → SearchIndex in entity type dispatch. |
| synapseclient/api/search_services.py | Adds thin async wrappers for SearchManagementController endpoints + autocomplete. |
| synapseclient/core/constants/concrete_types.py | Adds concrete types for SearchIndex entity and SearchIndexQuery async job. |
| synapseclient/models/__init__.py | Exports SearchIndex and search management/query models publicly. |
| synapseclient/models/mixins/asynchronous_job.py | Routes SEARCH_INDEX_QUERY to /search/query/async in async job URI map. |
| synapseclient/models/mixins/table_components.py | Marks SearchIndex as having read-only schema in table mixin config. |
| synapseclient/models/search_index.py | Implements the new SearchIndex entity model and async CRUD behavior. |
| synapseclient/models/search_management.py | Adds dataclasses/enums for search management resources and async query request/response shapes. |
| synapseclient/models/services/search_setup.py | Adds idempotent helpers to create/find synonym sets, analyzers, and search configurations. |
| tests/integration/synapseclient/models/async/test_search_index_async.py | Integration coverage for SearchIndex lifecycle + read-only search management endpoints. |
| tests/unit/synapseclient/api/unit_test_search_services.py | Unit tests validating verb/URI/body shapes for the new service functions. |
| tests/unit/synapseclient/models/async/unit_test_search_index_async.py | Unit tests for SearchIndex serialization, change tracking, and store validation. |
| tests/unit/synapseclient/models/async/unit_test_search_management_async.py | Unit tests for search management dataclass round-trips and async communicator wiring. |
| tests/unit/synapseclient/models/async/unit_test_search_setup_async.py | Unit tests for analyzer/synonym setup helper behaviors and error paths. |

Comment on lines +119 to +126
overrides: List[Tuple[str, str]] = []
for column_name, column_type in columns.items():
    system_default = COLUMN_TYPE_TO_DEFAULT_ANALYZER_QNAME.get(column_type)
    if system_default is None:
        continue
    target = substitutions.get(system_default, system_default)
    if system_default == default_substitutes_system_qname:
        continue
Comment on lines +139 to +141
if kind == EQUIVALENT:
    synonyms.append(", ".join(terms))
elif kind == EXPLICIT:

for token in (
    "unsupported entity type",
    "concretetype",
    "not allowed",