Integrate Specification-based Filtering into TraceEvent Pipeline #5

@rian-be

Problem

The current ChangeTrace version provides a rich set of Specification classes and queries for filtering TraceEvent streams, including:

  • Temporal specifications (TemporalQueries) – work hours, late-night commits, release bursts
  • Commit-related specifications (CommitQueries, CommitRelationshipQueries) – merge commits, PR-associated merges, parent/child relationships, author filters
  • Branch and actor specifications (ByBranchSpec, ByActorSpec)
  • Logical combinators (AndSpecification, OrSpecification, NotSpecification)
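For readers unfamiliar with the pattern, here is a minimal Python sketch of how specifications and logical combinators compose. It is not ChangeTrace's actual API: the `Specification` class, the `TraceEvent` fields, and the `by_actor`, `by_branch`, and `late_night` helpers are all hypothetical stand-ins for the classes listed above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class Specification(Generic[T]):
    """Hypothetical specification: wraps a predicate and supports &, |, ~."""

    def __init__(self, predicate: Callable[[T], bool]) -> None:
        self._predicate = predicate

    def is_satisfied_by(self, candidate: T) -> bool:
        return self._predicate(candidate)

    def __and__(self, other: "Specification[T]") -> "Specification[T]":
        # Plays the role of AndSpecification
        return Specification(
            lambda c: self.is_satisfied_by(c) and other.is_satisfied_by(c)
        )

    def __or__(self, other: "Specification[T]") -> "Specification[T]":
        # Plays the role of OrSpecification
        return Specification(
            lambda c: self.is_satisfied_by(c) or other.is_satisfied_by(c)
        )

    def __invert__(self) -> "Specification[T]":
        # Plays the role of NotSpecification
        return Specification(lambda c: not self.is_satisfied_by(c))


@dataclass(frozen=True)
class TraceEvent:
    """Assumed minimal event shape; the real TraceEvent may differ."""
    actor: str
    branch: str
    timestamp: datetime


def by_actor(name: str) -> Specification[TraceEvent]:
    return Specification(lambda e: e.actor == name)


def by_branch(name: str) -> Specification[TraceEvent]:
    return Specification(lambda e: e.branch == name)


def late_night(start_hour: int = 22, end_hour: int = 5) -> Specification[TraceEvent]:
    # Matches events whose hour falls in a window that wraps past midnight
    return Specification(
        lambda e: e.timestamp.hour >= start_hour or e.timestamp.hour < end_hour
    )


# Late-night events on main that were not produced by the "bot" actor
spec = by_branch("main") & late_night() & ~by_actor("bot")
```

Because each combinator returns another `Specification`, a composite can be passed anywhere a single specification is expected.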

While these specifications allow precise and reusable filtering, the current event pipeline does not fully leverage them within aggregators or semantic event derivation. As a result:

  • Aggregators process the full TraceEvent stream without declarative filtering
  • Deterministic, composable event selection before semantic transformation is limited
  • Future analytics and rendering pipelines cannot easily reuse existing specification logic

Goal

Integrate Specification-based filtering into the event processing pipeline to:

  • Make aggregators streaming-friendly by filtering events at the source
  • Support deterministic, composable filtering prior to semantic aggregation
  • Enable advanced queries in downstream analytics or rendering stages

The resulting pipeline would have the following shape:

TraceEvent Stream
      │
      ▼
Specification-based Filtering
      │
      ▼
EventAggregationEngine
      │
      ▼
SemanticEvent Stream
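The three stages in the diagram can be sketched as chained Python generators, so events are filtered lazily before they reach aggregation. This is an illustration only: `filter_events` and `aggregate` are hypothetical names, the event fields are assumed, and the aggregation here (counting consecutive events per actor) is a placeholder for whatever the real EventAggregationEngine does.

```python
from itertools import groupby
from typing import Callable, Iterable, Iterator, NamedTuple


class TraceEvent(NamedTuple):
    """Assumed minimal event shape for illustration."""
    actor: str
    branch: str
    kind: str


class SemanticEvent(NamedTuple):
    """Assumed output shape: a run of consecutive events by one actor."""
    actor: str
    event_count: int


def filter_events(
    events: Iterable[TraceEvent], spec: Callable[[TraceEvent], bool]
) -> Iterator[TraceEvent]:
    # Stage 1: lazily keep only the events that satisfy the specification
    return (event for event in events if spec(event))


def aggregate(events: Iterable[TraceEvent]) -> Iterator[SemanticEvent]:
    # Stage 2: placeholder aggregation, bundling consecutive events per actor
    for actor, run in groupby(events, key=lambda e: e.actor):
        yield SemanticEvent(actor, sum(1 for _ in run))


stream = [
    TraceEvent("alice", "main", "commit"),
    TraceEvent("alice", "dev", "commit"),
    TraceEvent("alice", "main", "commit"),
    TraceEvent("bob", "main", "commit"),
]

# Only events on main reach the aggregation stage
semantic = list(aggregate(filter_events(stream, lambda e: e.branch == "main")))
```

With this input, `semantic` contains one bundle of two events for alice and one bundle of one event for bob; the event on `dev` never reaches the aggregator.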

Proposed Tasks

To integrate specification-based filtering effectively, consider the following steps:

  • Extend the EventAggregationEngine to optionally accept Specification<TraceEvent> filters for dynamic, pre-aggregation event selection.
  • Adjust existing aggregators (e.g., CommitBundlingAggregator, PRLifecycleAggregator) to leverage specifications where appropriate, ensuring filtering occurs at the earliest stage possible.
  • Make temporal, author, branch, and commit-type specifications easily combinable for complex scenarios, using logical combinators (AND, OR, NOT).
  • Provide utility methods or helpers to simplify creating composite specifications for common patterns.
  • Update documentation with practical examples showing how filtered streams feed into semantic aggregation and downstream analytics.
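One possible shape for the first and fourth tasks, sketched in Python under stated assumptions: the engine takes an optional specification and applies it before any aggregator sees an event, and small helpers build composite specifications. The constructor signature, the `run` method, the `all_of`, `any_of`, and `negate` helpers, and the `label_merges` aggregator are all hypothetical; events are plain dictionaries purely to keep the example short.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Sequence

Event = Dict[str, str]
Spec = Callable[[Event], bool]
Aggregator = Callable[[Event], Iterable[str]]


def all_of(*specs: Spec) -> Spec:
    # Composite that holds only if every given specification holds (AND)
    return lambda event: all(spec(event) for spec in specs)


def any_of(*specs: Spec) -> Spec:
    # Composite that holds if at least one given specification holds (OR)
    return lambda event: any(spec(event) for spec in specs)


def negate(spec: Spec) -> Spec:
    # Composite that inverts a specification (NOT)
    return lambda event: not spec(event)


class EventAggregationEngine:
    """Hypothetical engine: an optional spec filters events before aggregation."""

    def __init__(
        self, aggregators: Sequence[Aggregator], spec: Optional[Spec] = None
    ) -> None:
        self._aggregators = list(aggregators)
        self._spec = spec

    def run(self, events: Iterable[Event]) -> Iterator[str]:
        for event in events:
            # With no spec configured, every event is processed (current behavior)
            if self._spec is not None and not self._spec(event):
                continue
            for aggregator in self._aggregators:
                yield from aggregator(event)


def label_merges(event: Event) -> Iterator[str]:
    # Toy aggregator: emits one semantic label per merge event
    if event["kind"] == "merge":
        yield f"merge by {event['actor']} on {event['branch']}"


events = [
    {"actor": "alice", "branch": "main", "kind": "merge"},
    {"actor": "bot", "branch": "main", "kind": "merge"},
    {"actor": "bob", "branch": "dev", "kind": "merge"},
    {"actor": "carol", "branch": "main", "kind": "commit"},
]

human_merges_on_main = all_of(
    lambda e: e["branch"] == "main",
    lambda e: e["kind"] == "merge",
    negate(lambda e: e["actor"] == "bot"),
)

unfiltered = list(EventAggregationEngine([label_merges]).run(events))
filtered = list(EventAggregationEngine([label_merges], human_merges_on_main).run(events))
```

Keeping the specification optional means existing callers that pass no filter see unchanged results, while new callers can narrow the stream without touching aggregator code.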

Benefits

  • Reusable, declarative filtering for all aggregators
  • Centralized, deterministic event selection before semantic derivation
  • Cleaner separation of concerns: filtering vs semantic transformation
  • Easier implementation of new analytics features based on filtered streams
