feat(discovery): selector-driven discover() and discover_labels() #28

Open
soupat wants to merge 6 commits into main from updated-hierarchical-discovery

Collaborator

@soupat soupat commented May 10, 2026

Summary

Replace the legacy hierarchical discovery trio (describe_fleet, list_devices, get_device_functions) with a single selector-driven pair (discover, discover_labels) plus a label foundation. One grammar covers device-only, device.function, device.event, function-only, and event-only queries; the same selector string drives every discovery and operation tool.

Three additive layers:

  1. Labels foundation. Optional labels: dict[str, str | list[str]] field on DeviceCapabilities, FunctionDef, and EventDef, populated via class-level DeviceDriver.labels = {...} or @rpc(labels=...) / @emit(labels=...) decorator kwargs.
  2. Selector DSL. Pure-Python parser at device_connect_edge.selector mapping a structured string onto a Selector dataclass. Supports key:value, key:[v1,v2] (OR within key), key:pattern* (anchored glob), k1:v1,k2:v2 (AND across keys), and bare-string id/name match.
  3. discover() and discover_labels() tools. Selector-driven discovery with a stable pagination envelope ({scope, matched, returned, offset, next_offset, results, label_histogram}); the label_histogram lets callers choose how to narrow next without a second call. Errors are returned as data, as structured {code, message} dicts, for the five failure modes.

flatten_device mirrors the legacy DeviceStatus.location into labels["location"] when capabilities don't declare one, so existing drivers populating only the heartbeat field remain discoverable.
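
A minimal sketch of the mirroring rule described above, for reviewers who want to see the precedence spelled out. The helper name mirror_location and its signature are hypothetical; the real logic lives inside flatten_device and may differ in detail.

```python
from typing import Optional, Union

LabelValue = Union[str, list[str]]


def mirror_location(
    labels: Optional[dict[str, LabelValue]],
    status_location: Optional[str],
) -> dict[str, LabelValue]:
    """Return a copy of labels with the heartbeat location filled in if absent.

    Hypothetical helper: illustrates the fallback rule only.
    """
    merged = dict(labels or {})
    # Declared labels win; the legacy DeviceStatus.location is only a fallback.
    if "location" not in merged and status_location:
        merged["location"] = status_location
    return merged
```

The point of the design is that a driver which already declares labels["location"] is never overridden by the heartbeat field.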

The legacy trio remains for one release as advisory-deprecated wrappers (each emits a DeprecationWarning pointing at the equivalent discover() invocation). All first-party adapters (Claude Agent SDK, Strands, LangChain, the in-tree StrandsOpenAIDeviceConnectAgent) are migrated to discover / discover_labels so they don't trigger the warning.

What's new vs main

  • New tools: discover(selector, offset, limit), discover_labels(key, offset, limit)
  • New module: device_connect_edge.selector (parser + matcher; dependency-free, stdlib only)
  • Labels field on FunctionDef / EventDef / DeviceCapabilities
  • @rpc(labels=...) / @emit(labels=...) decorator kwargs
  • flatten_device legacy-location mirror
  • Adapters migrated; integration tests added; ADR added at docs/adr/0001-selector-driven-discovery.md

Backwards compatibility

  • describe_fleet / list_devices / get_device_functions still work — they emit a DeprecationWarning pointing to the equivalent discover() call. Existing tests in test_tools_hierarchical.py continue to pass against them.
  • discover_devices() (the long-deprecated flat-roster tool) also gains a DeprecationWarning for parity.
  • Drivers do not need to declare labels to be discoverable; discover("device(*)") returns everything, and the legacy-location mirror keeps location-based queries working for drivers that only set DeviceStatus.location.
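
As a rough illustration of the advisory-deprecated wrapper pattern, the sketch below uses only the standard warnings module. The discover() stub, its default limit, the None value for next_offset, and the mapping of list_devices() onto discover("device(*)") are assumptions made for the example, not the shipped implementation.

```python
import warnings


def discover(selector: str, offset: int = 0, limit: int = 50) -> dict:
    """Stand-in for the real tool; always returns an empty envelope."""
    return {"scope": "device", "matched": 0, "returned": 0, "offset": offset,
            "next_offset": None, "results": [], "label_histogram": {}}


def list_devices() -> dict:
    """Legacy entry point kept as a thin, warning-emitting wrapper."""
    warnings.warn(
        'list_devices() is deprecated; use discover("device(*)") instead',
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    # Assumed equivalent call; the real wrapper may reshape the result.
    return discover("device(*)")
```

Because DeprecationWarning is ignored by default outside __main__ and test runners, callers who want to see it can run with -W default or enable it via warnings.simplefilter.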

Test plan

  • 912 unit tests pass (495 edge + 159 agent-tools + 258 server)
  • 91 integration tests pass on NATS backend (tests/tests/test_tools_selector.py adds 22 new tests covering all five scope shapes, label filters, OR-within-key, AND-across-keys, pagination, error envelope, and discover_labels per-axis + per-key forms)
  • No existing integration test broken
  • CI integration tests on Zenoh backend (skipped locally)

Commits

feat(types): add labels to capabilities, functions, and events
feat(selector): add selector DSL parser and matcher
feat(discovery): selector-driven discover and discover_labels
feat(discovery): structure discover/discover_labels error envelope

soupat added 6 commits May 9, 2026 21:10
Add an optional labels: dict[str, str | list[str]] field on
DeviceCapabilities, FunctionDef, and EventDef. Drivers populate them
either via class-level DeviceDriver.labels = {...} (device metadata)
or @rpc(labels=...) / @emit(labels=...) decorator kwargs. List values
express composite identity (a device that is both camera and inference).

These labels are the foundation for selector-based discovery and
operations: the discover/invoke/broadcast tools filter on them.

Add a pure-Python parser at device_connect_edge.selector that maps a
structured selector string onto a parsed Selector dataclass with five
scope shapes:

    device(<filters>)
    device(<filters>).function(<filters>)
    device(<filters>).event(<filters>)
    function(<filters>)
    event(<filters>)

Inside (...): key:value, key:[v1,v2] (OR within key), key:pattern*
(anchored glob), k1:v1,k2:v2 (AND across keys), bare-string id/name
match, or * to match all.
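
To make the filter semantics concrete, here is a small self-contained sketch of the part inside (...): AND across keys, OR within a key, and anchored glob matching. It is not the device_connect_edge.selector implementation; parse_filters and matches are hypothetical names, bare-string id/name matching and caret-positioned parse errors are omitted, and every value is passed through fnmatch.fnmatchcase rather than being checked for glob characters first.

```python
from fnmatch import fnmatchcase
from typing import Union


def parse_filters(body: str) -> dict[str, list[str]]:
    """Parse 'k1:v1,k2:[a,b]' into {key: [patterns]}; '*' or '' means no filters."""
    body = body.strip()
    if body in ("", "*"):
        return {}
    parts, depth, current = [], 0, ""
    for ch in body:
        # Only split on commas that sit outside a [..] value list.
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)

    filters: dict[str, list[str]] = {}
    for part in parts:
        key, sep, value = part.partition(":")
        if not sep:
            raise ValueError(f"expected key:value, got {part!r}")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            patterns = [v.strip() for v in value[1:-1].split(",")]
        else:
            patterns = [value]
        filters[key.strip()] = patterns
    return filters


def matches(labels: dict[str, Union[str, list[str]]],
            filters: dict[str, list[str]]) -> bool:
    """AND across keys, OR within a key; a list-valued label matches if any value does."""
    for key, patterns in filters.items():
        actual = labels.get(key, [])
        values = [actual] if isinstance(actual, str) else actual
        if not any(fnmatchcase(v, p) for v in values for p in patterns):
            return False
    return True
```

With this sketch, "category:camera,location:[lab*,roof]" matches a device labelled category=[camera, inference] and location=lab-2, and an empty filter set matches everything.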

Parse errors carry source + caret position for diagnostics. The matcher
is dependency-free (stdlib only) and applies vacuous-True semantics on
unset axes so callers can iterate without scope branching.

Add two new agent tools that replace the hierarchical trio:

- discover(selector, offset, limit) resolves a selector to matched
  devices, function tuples, or event tuples. Adaptive response shape:
  small result sets include full schemas inline; large sets paginate
  with name-and-labels summaries (DC_FUNCTION_THRESHOLD=20).
- discover_labels(key, offset, limit) returns the label vocabulary,
  per axis (no key) or paginated values for one key.

Response envelope: {scope, matched, returned, offset, next_offset,
results, label_histogram}. The label_histogram describes the matched
set (pre-pagination) so callers can choose how to narrow next without
a second call. On the device axis, multi-valued keys also expose
unique_devices for cardinality.
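
The envelope fields can be illustrated with a short sketch. The field names come from this PR; the histogram shape (key, then value, then count), the use of None for an exhausted next_offset, and the build_envelope helper itself are assumptions for the example, and unique_devices is left out.

```python
from collections import Counter
from typing import Any, Optional


def build_envelope(scope: str, matched: list[dict[str, Any]],
                   offset: int = 0, limit: int = 2) -> dict[str, Any]:
    """Page the matched set and summarise its labels before pagination."""
    histogram: dict[str, Counter] = {}
    for item in matched:
        for key, value in item.get("labels", {}).items():
            values = [value] if isinstance(value, str) else value
            histogram.setdefault(key, Counter()).update(values)

    page = matched[offset:offset + limit]
    end = offset + len(page)
    # Assumption: next_offset is None once the last page has been returned.
    next_offset: Optional[int] = end if end < len(matched) else None
    return {
        "scope": scope,
        "matched": len(matched),
        "returned": len(page),
        "offset": offset,
        "next_offset": next_offset,
        "results": page,
        "label_histogram": {k: dict(c) for k, c in histogram.items()},
    }
```

Since the histogram is computed over the full matched set rather than the current page, a caller on page one can already see which label values would narrow the query.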

flatten_device now mirrors the legacy DeviceStatus.location into
labels["location"] when capabilities.labels does not declare one, so
drivers populating only the heartbeat field remain discoverable via
selector queries on location.

Migrate first-party adapters (Claude Agent SDK, Strands, LangChain,
the in-tree StrandsOpenAIDeviceConnectAgent) to discover/discover_labels.
The legacy describe_fleet/list_devices/get_device_functions trio
remains for one release as advisory-deprecated wrappers; each call
emits a DeprecationWarning pointing to the equivalent discover()
invocation.

Test drivers carry category, direction, modality, and safety labels so
integration tests can exercise the full selector grammar end-to-end.

Errors returned by discover() and discover_labels() are now structured
{"code": ..., "message": ...} dicts rather than free-form strings. This
lets callers branch on the code programmatically while still surfacing
the message to logs or end users.

Codes emitted:
  - invalid_selector         selector is not a string
  - selector_parse_error     selector is a string but malformed
  - connection_error         registry / messaging backend unavailable
  - key_not_axis_qualified   discover_labels key missing axis prefix
  - unknown_axis             discover_labels axis not in
                             {device, function, event}
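
A caller-side sketch of branching on these codes. The five code strings are taken from the list above; the "error" key that wraps the {code, message} dict and the classify helper are assumptions, since the commit message does not say where the dict sits in the response.

```python
RETRYABLE = {"connection_error"}
CALLER_BUGS = {
    "invalid_selector",
    "selector_parse_error",
    "key_not_axis_qualified",
    "unknown_axis",
}


def classify(response: dict) -> str:
    """Return 'ok', 'retry', or 'fix_input' for a discover-style response."""
    error = response.get("error")  # assumption: errors live under an "error" key
    if error is None:
        return "ok"
    if error["code"] in RETRYABLE:
        return "retry"
    if error["code"] in CALLER_BUGS:
        return "fix_input"
    raise ValueError(f"unrecognised error code: {error['code']!r}")
```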
…ools

The doc is a developer guide rather than a decision record: drop the
"ADR 0001:" framing, status line, and motivation paragraph. Trim the
content to the discovery surface that ships with this PR (labels,
selector grammar, discover, discover_labels, response envelope, error
codes) so worked examples are runnable today.

Trim docs/discovery.md to the discovery surface that ships in this PR
(labels, selector grammar, discover, discover_labels, response envelope,
error codes). Drop the ADR framing (status line, summary/motivation),
the "Operations" section listing tools that have not landed yet, the
CLI section, and worked examples that called those tools, so the guide
matches what a developer can actually run today.
@kavya-chennoju
Collaborator

Case sensitivity in selector matching

KeyFilter.matches (and Filter.matches for the bare-string axis) uses fnmatch.fnmatchcase, which is case-sensitive. Concretely:

discover("device(category:Camera)")     # 0 matches
discover("device(category:camera)")     # N matches

Users coming from Kubernetes labels, AWS tags, etc. won't expect this. Two reasonable resolutions:

  1. Document it. Add a callout to docs/discovery.md and the discover() docstring stating that selector matching is case-sensitive, and recommend lowercase label keys/values as the convention.

  2. Switch to case-insensitive. Lower-case both sides at compare time (fnmatch.fnmatchcase(actual.lower(), pattern.lower())). Cheap, removes the footgun, no impact on existing test fixtures since they use lowercase already.

I'd lean (2) — labels are metadata, not identifiers; case-sensitive matching is rarely what callers want. Happy to put up the patch if useful.

Minor: also worth aligning the glob-detection check ("*" in pattern or "?" in pattern) with the matcher's actual fnmatch capability — fnmatch supports [abc] ranges too, so currently category:[abc]rgb would be treated as a literal-string match rather than a glob.
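
For reference, a minimal sketch of what option (2) and the glob-detection fix could look like, using only the stdlib fnmatch module. The helper names is_glob and matches_ci are hypothetical and this is not the code in KeyFilter.matches.

```python
from fnmatch import fnmatchcase

# Metacharacters that fnmatch interprets: *, ?, and [seq] character sets.
GLOB_CHARS = ("*", "?", "[")


def is_glob(pattern: str) -> bool:
    """True if the pattern contains any metacharacter fnmatch interprets."""
    return any(ch in pattern for ch in GLOB_CHARS)


def matches_ci(actual: str, pattern: str) -> bool:
    """Case-insensitive variant: fold both sides before the anchored match."""
    return fnmatchcase(actual.lower(), pattern.lower())
```

One caveat with folding the pattern: character sets such as [A-Z] are lower-cased too, which is harmless here because the value is folded the same way.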
