
feat(sidecar): forward FFE exposures to EVP proxy #2026

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 33 commits into main from leo.romanovsky/ffe-sidecar-exposures
Jun 1, 2026

Conversation


@leoromanovsky leoromanovsky commented May 22, 2026

Motivation

PHP FFE exposure delivery needs a native path with a cache that persists beyond a single PHP request/thread. The shared design doc is the cross-PR reference: https://docs.google.com/document/d/1NvMfTpZWLBlFmEFNjdnlMyeVpy5l7KD8qujGFco6w2w/edit?tab=t.0

This PR is exposure-only. Metrics were split into #2052 so reviewers can evaluate exposure cache and delivery separately from OTLP evaluation metrics.

Changes

This adds caller-driven FFE exposure sidecar actions, exposure payload forwarding through the Agent EVP proxy, and a shared exposure cache that deduplicates repeated (service, env, version, flag, subject) assignments across PHP requests and sidecar connections.

The reusable FFE-domain pieces now live in datadog-ffe behind the exposure-events feature: exposure input types, the LRU deduplication cache, and JSON payload encoding. datadog-sidecar keeps only sidecar-specific work: deriving the agent EVP endpoint, building the HTTP request, applying the timeout, logging delivery failures, and integrating with sidecar lifecycle/actions.
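The deduplication described above can be pictured with a small, dependency-free sketch. This is not the `datadog-ffe` implementation: `ExposureKey`, `ExposureDedupCache`, and `should_send` are hypothetical names, the eviction scan is deliberately naive, and the sketch ignores the assigned variant, which the real cache may also take into account.

```rust
use std::collections::HashMap;

/// Hypothetical dedup key: one entry per (service, env, version, flag, subject).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ExposureKey {
    pub service: String,
    pub env: String,
    pub version: String,
    pub flag_key: String,
    pub subject_id: String,
}

/// Minimal LRU-style "seen" cache. Eviction scans for the oldest entry,
/// which is O(n) but keeps the sketch free of external crates.
pub struct ExposureDedupCache {
    capacity: usize,
    tick: u64,
    last_seen: HashMap<ExposureKey, u64>,
}

impl ExposureDedupCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, tick: 0, last_seen: HashMap::new() }
    }

    /// Returns true when the exposure is a first sighting and should be
    /// forwarded, false when it duplicates a cached entry.
    pub fn should_send(&mut self, key: ExposureKey) -> bool {
        self.tick += 1;
        let tick = self.tick;
        if let Some(seen) = self.last_seen.get_mut(&key) {
            *seen = tick; // refresh recency for the duplicate
            return false;
        }
        if self.last_seen.len() >= self.capacity {
            // Evict the least recently used key.
            let oldest = self
                .last_seen
                .iter()
                .min_by_key(|(_, t)| **t)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.last_seen.remove(&k);
            }
        }
        self.last_seen.insert(key, tick);
        true
    }
}
```

Because the cache lives in the shared sidecar rather than in a PHP request, a repeated assignment from a later request or another connection hits the same map and is suppressed.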

Current PHP MVP path:

flowchart LR
    Eval["PHP native evaluation<br/>ddog_ffe_evaluate"]
    Batch["PHP tracer native memory<br/>request/thread-local exposure batch"]
    Shutdown["PHP RSHUTDOWN<br/>flush exposure batch"]
    Action["sidecar action<br/>record FFE exposures"]
    Domain["datadog-ffe<br/>feature: exposure-events<br/>types + cache + JSON encoder"]
    Sidecar["shared sidecar<br/>cross-request and cross-thread exposure cache"]
    Agent["Datadog Agent<br/>EVP proxy"]
    Intake["FFE exposure intake"]

    Eval -->|"doLog=true assignment"| Batch
    Batch --> Shutdown
    Shutdown --> Action
    Action --> Domain
    Domain --> Sidecar
    Sidecar --> Agent
    Agent --> Intake

Future Python/Ruby connection:

flowchart LR
    PyToday["dd-trace-py today<br/>host-language exposure writer"]
    RbToday["dd-trace-rb today<br/>host-language exposure writer"]
    PyFuture["dd-trace-py future<br/>explicit native opt-in"]
    RbFuture["dd-trace-rb future<br/>explicit native opt-in"]
    Native["libdatadog caller-driven<br/>FFE exposure action"]
    Shared["shared sidecar<br/>dedupe + EVP delivery"]
    Agent["Datadog Agent<br/>EVP proxy"]

    PyToday -. "current direct EVP path" .-> Agent
    RbToday -. "current direct EVP path" .-> Agent
    PyFuture -. "after ownership switch" .-> Native
    RbFuture -. "after ownership switch" .-> Native
    Native --> Shared
    Shared --> Agent

The future Python/Ruby arrows are intentionally not active behavior in this PR. They show why the reusable code lives in datadog-ffe rather than directly in sidecar internals, while preserving today's host-language ownership.

Why Python/Ruby do not double count today:

  • Python and Ruby use libdatadog for evaluation only; the evaluator returns assignment metadata and does not enqueue exposure telemetry as a side effect.
  • This PR adds a separate caller-driven sidecar action. Exposure emission happens only when an SDK explicitly records exposure candidates into that action. PHP wires this in its companion PR; Python and Ruby do not.
  • Python and Ruby therefore keep exactly their current host-language EVP exposure writers. They are not also sending exposure candidates through this native sidecar path.
  • The sidecar cache only deduplicates exposure candidates that enter the native sidecar path. It cannot protect direct host-language EVP writers, so future Python/Ruby migration must switch ownership to native logging and disable/bypass the host exposure writer for the same evaluations.

Reference implementation check: dd-trace-java follows the same exposure semantics and user ergonomics. Java's DDEvaluator is SDK-owned evaluation code; after resolving an assignment, it checks allocation doLog, builds an exposure event with flag, variant, allocation, targeting key, and context, and dispatches it through FeatureFlaggingGateway. ExposureWriterImpl subscribes to those exposure events, queues them, deduplicates with an LRU exposure cache, serializes service/env/version context, and posts to the Agent EVP proxy. Application code only calls the OpenFeature provider; it does not call an exposure API.

PHP mirrors that canonical shape, with PHP-specific lifecycle mechanics: the dd-trace-php evaluation bridge records doLog=true exposure candidates internally, request shutdown flushes the batch, and this PR's sidecar path owns cross-request deduplication and EVP delivery. For future Python/Ruby migration, the same rule applies: wire native exposure recording inside the SDK-owned evaluation path, and turn off the existing host-language exposure writer for those evaluations.
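As a rough illustration of the lifecycle described above (record `doLog=true` candidates during the request, flush once at shutdown), the following sketch models a request-local batch. The types and method names are hypothetical and do not reflect the dd-trace-php or libdatadog code.

```rust
/// Hypothetical exposure candidate produced by an evaluation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExposureCandidate {
    pub flag_key: String,
    pub subject_id: String,
    pub variant: String,
}

/// Request/thread-local batch, assumed to be owned by the tracer.
#[derive(Default)]
pub struct RequestExposureBatch {
    pending: Vec<ExposureCandidate>,
}

impl RequestExposureBatch {
    /// Called from the evaluation bridge; only doLog=true assignments are kept.
    pub fn record(&mut self, do_log: bool, candidate: ExposureCandidate) {
        if do_log {
            self.pending.push(candidate);
        }
    }

    /// Called at request shutdown; hands the whole batch to the caller
    /// (for example a sidecar action) and leaves the batch empty.
    pub fn flush(&mut self) -> Vec<ExposureCandidate> {
        std::mem::take(&mut self.pending)
    }
}
```

Application code never touches this batch in the described design; the SDK-owned evaluation path is the only caller of `record`.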

Decisions

No telemetry is emitted automatically from shared libdatadog evaluator calls. SDKs must explicitly enqueue FFE telemetry actions. This remains required for Python/Ruby coexistence because those SDKs currently log exposures and metrics in host-language code.

The sidecar cache deduplicates only exposure candidates sent through this native sidecar path; it cannot deduplicate direct host-language EVP writers.

Future Python/Ruby migration must be an ownership switch, not an additional writer. When those SDKs opt into this native exposure path, their host-language exposure writers must be disabled or bypassed for the same evaluations to avoid double counting.

Validation

Current head (8be471fbc) local validation:

cd /Users/leo.romanovsky/go/src/github.com/DataDog/libdatadog-ffe-sidecar-exposures
cargo fmt --check
cargo test -p datadog-ffe --features exposure-events telemetry::exposures
cargo test -p datadog-sidecar ffe_exposure
cargo check -p datadog-ffe
cargo check -p datadog-sidecar-ffi

Results: datadog-ffe exposure tests passed (4 passed), sidecar exposure tests passed (6 passed), default datadog-ffe check passed, sidecar FFI check passed, fmt check passed with only the repo stable-rustfmt warnings.

Prior downstream PHP behavior validation before the reusable-crate refactor, from DataDog/dd-trace-php#3910 using this PR at 6d23848a:

ffe-dogfooding subject=php-3910-split-1779981442
php7_exposures=1 php8_exposures=1
php7_metrics=0 php8_metrics=0

System-tests downstream validation:

TEST_LIBRARY=php ./run.sh FEATURE_FLAGGING_AND_EXPERIMENTATION tests/ffe/test_exposures.py -vv

Result: 11 passed in 77.53 seconds.

Related PRs: DataDog/dd-trace-php#3906, DataDog/dd-trace-php#3910, #2052, DataDog/system-tests#7031.

Adds SidecarAction::FfeExposures variant so the PHP tracer can hand a
batched exposure payload to the sidecar, and adds an ffe_flusher module
that POSTs the payload to the agent's EVP proxy at
/evp_proxy/v2/api/v2/exposures with X-Datadog-EVP-Subdomain:
event-platform-intake. Matches dd-trace-go / ruby / python / js /
dotnet wire protocol. Fire-and-forget; non-2xx is logged and dropped
(no agent_info gating, consistent with other tracers).

Also exposes ddog_sidecar_send_ffe_exposures FFI in datadog-sidecar-ffi
for the PHP extension to call from its RSHUTDOWN / MSHUTDOWN hooks.

Tests: 3 httpmock-backed cases cover POST method + path + subdomain
header + body, non-2xx drop, and endpoint-path override while
preserving authority / scheme / auth / timeout.
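The request shape described in this commit message can be sketched without any HTTP client. The path, header name, and header value come from the text above; `EvpRequest`, `build_exposure_request`, and `should_log_and_drop` are hypothetical helpers for illustration, and the JSON content type is taken from the later comparison with the metrics path.

```rust
/// Hypothetical, transport-agnostic description of the outgoing request.
#[derive(Debug, PartialEq, Eq)]
pub struct EvpRequest {
    pub method: &'static str,
    pub url: String,
    pub headers: Vec<(&'static str, &'static str)>,
    pub body: String,
}

pub const EXPOSURES_PATH: &str = "/evp_proxy/v2/api/v2/exposures";
pub const EVP_SUBDOMAIN_HEADER: &str = "X-Datadog-EVP-Subdomain";
pub const EVP_SUBDOMAIN: &str = "event-platform-intake";

/// Builds the POST against the Agent EVP proxy from the agent base URL.
pub fn build_exposure_request(agent_base_url: &str, json_payload: String) -> EvpRequest {
    // Avoid a double slash when the base URL ends with '/'.
    let base = agent_base_url.trim_end_matches('/');
    EvpRequest {
        method: "POST",
        url: format!("{}{}", base, EXPOSURES_PATH),
        headers: vec![
            (EVP_SUBDOMAIN_HEADER, EVP_SUBDOMAIN),
            ("Content-Type", "application/json"),
        ],
        body: json_payload,
    }
}

/// Fire-and-forget policy: any non-2xx status is logged and the batch dropped.
pub fn should_log_and_drop(status: u16) -> bool {
    !(200..300).contains(&status)
}
```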

github-actions Bot commented May 22, 2026

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

  • Base Branch: origin/main
  • PR Branch: origin/leo.romanovsky/ffe-sidecar-exposures

Summary by Rule

| Rule | Base Branch | PR Branch | Change |
|------|-------------|-----------|--------|
| expect_used | 2 | 2 | No change (0%) |
| unwrap_used | 7 | 7 | No change (0%) |
| Total | 9 | 9 | No change (0%) |

Annotation Counts by File

| File | Base Branch | PR Branch | Change |
|------|-------------|-----------|--------|
| datadog-sidecar/src/service/sidecar_server.rs | 6 | 6 | No change (0%) |
| datadog-sidecar/src/service/telemetry.rs | 3 | 3 | No change (0%) |

Annotation Stats by Crate

| Crate | Base Branch | PR Branch | Change |
|-------|-------------|-----------|--------|
| clippy-annotation-reporter | 5 | 5 | No change (0%) |
| datadog-ffe-ffi | 1 | 1 | No change (0%) |
| datadog-ipc | 21 | 21 | No change (0%) |
| datadog-live-debugger | 6 | 6 | No change (0%) |
| datadog-live-debugger-ffi | 10 | 10 | No change (0%) |
| datadog-profiling-replayer | 4 | 4 | No change (0%) |
| datadog-remote-config | 3 | 3 | No change (0%) |
| datadog-sidecar | 57 | 57 | No change (0%) |
| libdd-common | 13 | 13 | No change (0%) |
| libdd-common-ffi | 12 | 12 | No change (0%) |
| libdd-data-pipeline | 5 | 5 | No change (0%) |
| libdd-ddsketch | 2 | 2 | No change (0%) |
| libdd-dogstatsd-client | 1 | 1 | No change (0%) |
| libdd-profiling | 13 | 13 | No change (0%) |
| libdd-telemetry | 20 | 20 | No change (0%) |
| libdd-tinybytes | 4 | 4 | No change (0%) |
| libdd-trace-normalization | 2 | 2 | No change (0%) |
| libdd-trace-obfuscation | 3 | 3 | No change (0%) |
| libdd-trace-stats | 1 | 1 | No change (0%) |
| libdd-trace-utils | 13 | 13 | No change (0%) |
| Total | 196 | 196 | No change (0%) |

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.


datadog-official Bot commented May 22, 2026

Tests

🎉 All green!

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage (details)
Patch Coverage: 83.11%
Overall Coverage: 73.31% (+0.10%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 7c51078


codecov-commenter commented May 22, 2026

Codecov Report

❌ Patch coverage is 83.11445% with 90 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.31%. Comparing base (1fe5944) to head (7c51078).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2026      +/-   ##
==========================================
+ Coverage   73.21%   73.31%   +0.09%     
==========================================
  Files         462      464       +2     
  Lines       77218    77748     +530     
==========================================
+ Hits        56536    56998     +462     
- Misses      20682    20750      +68     
| Components | Coverage Δ |
|------------|------------|
| libdd-crashtracker | 65.47% <ø> (+0.01%) ⬆️ |
| libdd-crashtracker-ffi | 37.68% <ø> (ø) |
| libdd-alloc | 98.77% <ø> (ø) |
| libdd-data-pipeline | 85.84% <ø> (ø) |
| libdd-data-pipeline-ffi | 77.03% <ø> (ø) |
| libdd-common | 79.89% <ø> (ø) |
| libdd-common-ffi | 74.41% <ø> (ø) |
| libdd-telemetry | 73.34% <ø> (-0.03%) ⬇️ |
| libdd-telemetry-ffi | 31.36% <ø> (ø) |
| libdd-dogstatsd-client | 82.64% <ø> (ø) |
| datadog-ipc | 76.22% <ø> (+1.46%) ⬆️ |
| libdd-profiling | 81.68% <ø> (-0.02%) ⬇️ |
| libdd-profiling-ffi | 64.79% <ø> (ø) |
| libdd-sampling | 97.41% <ø> (ø) |
| datadog-sidecar | 34.59% <74.46%> (+1.74%) ⬆️ |
| datdog-sidecar-ffi | 8.61% <0.00%> (-0.88%) ⬇️ |
| spawn-worker | 48.86% <ø> (ø) |
| libdd-tinybytes | 93.80% <ø> (ø) |
| libdd-trace-normalization | 81.71% <ø> (ø) |
| libdd-trace-obfuscation | 87.30% <ø> (ø) |
| libdd-trace-protobuf | 68.25% <ø> (ø) |
| libdd-trace-utils | 89.17% <ø> (ø) |
| libdd-tracer-flare | 86.88% <ø> (ø) |
| libdd-log | 74.83% <ø> (ø) |

Adds a parallel pathway for PHP feature-flag evaluation metrics
mirroring the FfeExposures forwarder. dd-trace-php encodes
`feature_flag.evaluations` counters as OTLP/protobuf in PHP
(via its existing PHP 7-safe `OtlpMetricEncoder`) and ships the
encoded bytes to the sidecar, which POSTs them to the user-configured
OTLP HTTP metrics intake.

Why a sibling action instead of reusing FfeExposures:

- The OTLP collector is not the Datadog Agent. It's user-configurable
  via OTEL_EXPORTER_OTLP_METRICS_ENDPOINT (default
  http://localhost:4318/v1/metrics), so the endpoint travels with the
  payload rather than being derived from the sidecar session's agent
  base URL.
- Content type differs (application/x-protobuf vs application/json).
- No EVP subdomain header.
- The payload is binary protobuf, not a JSON string.

dd-trace-php side (PR DataDog/dd-trace-php#3911) will refactor its
existing `OtlpHttpMetricTransport` (which currently does PHP-side
HTTP I/O, violating the architectural rule "no I/O outside the
sidecar") to call this new FFI.
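The differences listed above (endpoint source, content type, EVP header, payload encoding) can be made concrete with a small sketch of two sibling actions. `FfeAction`, `Delivery`, and `plan_delivery` are hypothetical names used only for illustration; the actual sidecar action variants and flusher code may be shaped differently.

```rust
/// Hypothetical stand-in for the two sibling sidecar actions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FfeAction {
    /// JSON exposure batch; the endpoint is derived from the agent base URL.
    Exposures { json_payload: String },
    /// Binary OTLP/protobuf payload; the endpoint travels with the action.
    Metrics { endpoint: String, otlp_payload: Vec<u8> },
}

/// What the flusher would need to know to send the request.
#[derive(Debug, PartialEq, Eq)]
pub struct Delivery {
    pub url: String,
    pub content_type: &'static str,
    pub evp_subdomain: Option<&'static str>,
    pub body: Vec<u8>,
}

pub fn plan_delivery(action: FfeAction, agent_base_url: &str) -> Delivery {
    match action {
        FfeAction::Exposures { json_payload } => Delivery {
            url: format!(
                "{}/evp_proxy/v2/api/v2/exposures",
                agent_base_url.trim_end_matches('/')
            ),
            content_type: "application/json",
            evp_subdomain: Some("event-platform-intake"),
            body: json_payload.into_bytes(),
        },
        FfeAction::Metrics { endpoint, otlp_payload } => Delivery {
            // User-configured OTLP endpoint, not the Datadog Agent.
            url: endpoint,
            content_type: "application/x-protobuf",
            evp_subdomain: None,
            body: otlp_payload,
        },
    }
}
```

Keeping the two variants separate means neither path needs optional fields that only apply to the other one.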

Validation:

- `cargo test -p datadog-sidecar ffe` passes 7 tests
  (3 exposures + 4 metrics).
- `cargo check -p datadog-sidecar-ffi` clean.

leoromanovsky added a commit to DataDog/dd-trace-php that referenced this pull request May 23, 2026

Adds Mermaid sources and rendered PNGs for the hook (this) PR plus a
README documenting the regeneration workflow.

- `docs/php-ffe-stack/stack-pr3909.mmd` + `.png` — 4-PR stack with this
  PR highlighted (M1 done; EVP and metrics as siblings to come).
- `docs/php-ffe-stack/system-pr3909.mmd` + `.png` — target system
  architecture; this PR contributes the EvaluationCompletedHook +
  OpenFeature provider hook surface. All downstream nodes (writers,
  sidecar FFI, sidecar process, backends) marked future.
- `docs/php-ffe-stack/README.md` — npx invocation for regenerating
  PNGs locally; PR-by-PR diagram table; architectural rule note.

The architectural rule encoded in the system diagram (all I/O via the
libdatadog sidecar) is the same rule Bob applied to PR #3910. See
DataDog/libdatadog#2026 for the sidecar-side support.

leoromanovsky added a commit to DataDog/dd-trace-php that referenced this pull request May 23, 2026

Per Bob's PR review (2026-05-22), the tracer extension must perform no
I/O outside the sidecar. Replaces the raw-socket `AgentExposureTransport`
with `SidecarExposureTransport`, which forwards exposure batches to the
libdatadog sidecar via a new native PHP function `\DDTrace\send_ffe_exposures`
that calls the `ddog_sidecar_send_ffe_exposures` FFI added in
DataDog/libdatadog#2026.

PHP side:

- Delete `Internal/Exposure/AgentExposureTransport.php` (raw socket
  POST to the Agent EVP proxy).
- Add `Internal/Exposure/SidecarExposureTransport.php` that JSON-encodes
  the batch and calls `\DDTrace\send_ffe_exposures()`. Fire-and-forget;
  the sidecar handles retries.
- Update `ExposureWriter::createDefault()` to instantiate the sidecar
  transport.
- Drop the obsolete `testAgentTransportBuildsAgentEvpRequest` PHPUnit
  test (HTTP construction now lives in libdatadog, covered by
  `cargo test -p datadog-sidecar ffe_flusher`).
- Add `Internal/DefaultEvaluationCompletedHook` and
  `Internal/CompositeEvaluationCompletedHook` so production callers go
  through a composite hook factory. In this PR the composite contains
  only `ExposureHook`; the metrics PR (#3911) contributes
  `EvaluationMetricHook` and the file conflict at merge resolves by
  combining both. Update `Client::create()` to call
  `DefaultEvaluationCompletedHook::create()`.

C/Rust bridge:

- Declare `ddog_ByteSlice` (and underlying `ddog_Slice_U8`) in
  `components-rs/common.h` for the metrics path; declare both
  `ddog_sidecar_send_ffe_exposures` and `ddog_sidecar_send_ffe_metrics`
  in `components-rs/sidecar.h`.
- Add C wrappers `ddtrace_sidecar_send_ffe_exposures(zend_string *)`
  and `ddtrace_sidecar_send_ffe_metrics(zend_string *endpoint,
  zend_string *payload_bytes)` in `ext/sidecar.{h,c}` that call the FFI
  with the current sidecar transport + instance id + queue id.
- Declare native PHP functions `\DDTrace\send_ffe_exposures(string): bool`
  and `\DDTrace\send_ffe_metrics(string, string): bool` in
  `ext/ddtrace.stub.php`; add corresponding arginfo entries and
  `ZEND_FUNCTION` registrations in `ext/ddtrace_arginfo.h`; implement
  `PHP_FUNCTION(DDTrace_send_ffe_exposures)` and
  `PHP_FUNCTION(DDTrace_send_ffe_metrics)` in `ext/ddtrace.c`.
- Bump `libdatadog` submodule to FFE branch tip `29762335c` (which
  provides both FFIs). The submodule will be bumped to the libdatadog
  main commit once #2026 merges.

Docs:

- Add `docs/php-ffe-stack/{stack,system}-pr3910.{mmd,png}` for this PR.

Validation:

- `php vendor/bin/phpunit --config phpunit.xml tests/api/Unit/FeatureFlags`
  → 41 tests, 174 assertions, OK.
- libdatadog sidecar tests (`cargo test -p datadog-sidecar ffe_flusher`)
  → 3 passed, on the pinned submodule commit.
- Mermaid PNGs regenerate via `npx @mermaid-js/mermaid-cli`.

`make test_featureflags` and `make test_c TESTS=tests/ext/ffe/...` will
run in CI; running them locally requires rebuilding the extension which
is gated behind libdatadog #2026 merging.

leoromanovsky added a commit to DataDog/dd-trace-php that referenced this pull request May 23, 2026

Adds the M3 evaluation-metrics layer on top of the hook PR (#3909) as a
sibling of the EVP exposures PR (#3910). Records `feature_flag.evaluations`
for both PHP 7 (DD Client hook) and PHP 8 (OpenFeature SDK hook); both
paths share `EvaluationMetricHook::sharedWriter()` for unified
aggregation. OTLP/protobuf payloads are encoded in PHP via the existing
`OtlpMetricEncoder` and delivered to the user-configured OTLP HTTP
metrics intake through the libdatadog sidecar (`ddog_sidecar_send_ffe_metrics`
FFI added in DataDog/libdatadog#2026).

This branch is force-pushed (user-authorized one-time exception to the
no-force-push rule, 2026-05-23) to restructure history away from being
linearly stacked on the M2 exposures PR (#3910). The PR now stacks
directly on the hook PR (#3909) as a sibling of the EVP PR.

PHP side:

- Add `Internal/Metric/EvaluationMetricWriter` with bounded series
  aggregation, drop accounting, and shutdown flush.
- Add `Internal/Metric/EvaluationMetricHook` (DD Client hook) and
  `OtlpMetricEncoder` (PHP 7-safe protobuf encoding).
- Add `Internal/Metric/SidecarOtlpMetricsTransport` that calls
  `\DDTrace\send_ffe_metrics()` (FFI declared in #3910). Endpoint
  resolution: `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT`, falling back to
  `OTEL_EXPORTER_OTLP_ENDPOINT + /v1/metrics`, default
  `http://localhost:4318/v1/metrics`.
- Add `DDTrace\OpenFeature\EvalMetricsHook` implementing
  `OpenFeature\interfaces\hooks\Hook` (after + error stages), registered
  on `DataDogProvider` via `setHooks()`.
- `DataDogProvider` constructs its internal DD `Client` with
  `DefaultEvaluationCompletedHook::createWithoutMetric()` so the
  OpenFeature path records the metric via the OpenFeature hook (PR 3911
  scope) and NOT via the DD Client hook — preventing double-counting.
  PHP 7 path keeps recording via the DD Client hook.
- Add `Internal/CompositeEvaluationCompletedHook` and
  `Internal/DefaultEvaluationCompletedHook` (metric-only composite).
  This is the merge-conflict point with PR #3910's `[ExposureHook]`
  composite — second merge resolves by combining both hooks.
- Update `Client::create()` to call `DefaultEvaluationCompletedHook::create()`.
- Drop the obsolete `testOtlpTransportBuildsHttpProtobufRequest` PHPUnit
  test (HTTP construction now lives in libdatadog, covered by
  `cargo test -p datadog-sidecar ffe_metrics_flusher`).
- Add `_files_openfeature.php` entry for `EvalMetricsHook.php`.
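The endpoint resolution order mentioned in the list above (specific variable, then generic variable plus `/v1/metrics`, then the default) is performed on the PHP side; the sketch below restates it in Rust purely for illustration. `resolve_metrics_endpoint` is a hypothetical helper, and treating empty values as unset is an assumption rather than documented behavior.

```rust
pub const DEFAULT_OTLP_METRICS_ENDPOINT: &str = "http://localhost:4318/v1/metrics";

/// Resolves the OTLP metrics endpoint from a variable lookup function.
/// The lookup is injected so the logic can be exercised without touching
/// the process environment.
pub fn resolve_metrics_endpoint<F>(lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    // Assumption: blank values are treated the same as unset variables.
    let non_empty = |name: &str| lookup(name).filter(|v| !v.trim().is_empty());

    if let Some(endpoint) = non_empty("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT") {
        return endpoint;
    }
    if let Some(base) = non_empty("OTEL_EXPORTER_OTLP_ENDPOINT") {
        return format!("{}/v1/metrics", base.trim_end_matches('/'));
    }
    DEFAULT_OTLP_METRICS_ENDPOINT.to_string()
}
```

Against the real environment the call would be `resolve_metrics_endpoint(|name| std::env::var(name).ok())`.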

C/Rust bridge: the `\DDTrace\send_ffe_metrics()` native function, its C
wrapper `ddtrace_sidecar_send_ffe_metrics()`, and the
`ddog_sidecar_send_ffe_metrics` FFI declaration in `components-rs/sidecar.h`
were already added in #3910. This PR's branch picks up those changes
once #3910 merges (or via the same libdatadog submodule pin during
review). For development locally the libdatadog submodule is pinned to
the FFE branch tip (`29762335c`).

Docs:

- Add `docs/php-ffe-stack/{stack,system}-pr3911.{mmd,png}` per the
  4-PR documentation convention.

Validation:

- `php vendor/bin/phpunit --config phpunit.xml tests/api/Unit/FeatureFlags`
  → 40 tests, 160 assertions, OK.
- Mermaid PNGs regenerate via `npx @mermaid-js/mermaid-cli`.

`make test_featureflags`, OpenFeature PHPUnit, and ffe-dogfooding
end-to-end validation will run in CI / are validated separately by
FOLLOW-05 Steps 4–5.

The PHP FFE writers (`SidecarExposureTransport`,
`SidecarOtlpMetricsTransport`) can fire as soon as evaluations begin —
which is often earlier than the first remote-config metadata call that
registers the application against a `QueueId`.

Previously, FFE dispatch lived inside the
`if let Entry::Occupied(entry) = applications.entry(queue_id) { ... }`
block in `enqueue_actions`. That block is only entered after the PHP
runtime has called `set_remote_config_data` or `set_request_config` for
this queue. For shorter-lived PHP processes (parametric test client,
CLI tools, eager evaluators) the FFE batch arrives before the app
registration call lands, so the entire batch was silently dropped.

This change filters `FfeExposures` and `FfeMetrics` actions out of
the action vec before the application-entry gate and dispatches them
directly: both only need session-level state (the trace endpoint /
the user-supplied OTLP endpoint), not per-application telemetry context.
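A simplified model of this change is sketched below. It is not the `sidecar_server.rs` code: `Action`, `Session`, and the `u64` queue id are hypothetical stand-ins, and a plain `get_mut` replaces the `Entry::Occupied` match, but it shows FFE actions being partitioned out before the registration gate.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced action set.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    Telemetry(String),
    FfeExposures(String),
    FfeMetrics { endpoint: String, payload: Vec<u8> },
}

impl Action {
    fn is_ffe(&self) -> bool {
        matches!(self, Action::FfeExposures(_) | Action::FfeMetrics { .. })
    }
}

/// Hypothetical session state: per-queue applications plus a record of
/// FFE actions dispatched at session level.
#[derive(Default)]
pub struct Session {
    pub applications: HashMap<u64, Vec<Action>>,
    pub ffe_dispatched: Vec<Action>,
}

impl Session {
    pub fn enqueue_actions(&mut self, queue_id: u64, actions: Vec<Action>) {
        // Split FFE actions out first: they only need session-level state.
        let (ffe, rest): (Vec<Action>, Vec<Action>) =
            actions.into_iter().partition(Action::is_ffe);
        self.ffe_dispatched.extend(ffe);

        // Everything else still requires a registered application.
        if let Some(app) = self.applications.get_mut(&queue_id) {
            app.extend(rest);
        }
        // Unregistered queue: the remaining actions are dropped, as before.
    }
}
```

With this ordering, an FFE batch that arrives before the application is registered is still dispatched, while the other actions keep the old gate.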

Validated locally with dd-trace-php system-tests parametric
`Test_Feature_Flag_Parametric_Evaluation_Metrics::test_php_ffe_evaluation_metric`,
which now passes (26/27 FFE-scoped tests; remaining failure is the
exposure_event test on a branch that lacks the exposure code path).

Pair the EVP-exposure forwarder name with its sibling `ffe_metrics_flusher`.
The unqualified `ffe_flusher` predates the OTLP-metrics forwarder and the
asymmetry was leaving readers wondering whether `ffe_flusher` was a
parent/umbrella module or a sibling.

Renames the file via `git mv` (preserving blame history) and updates all
references (mod.rs, sidecar_server.rs dispatch arm, ffe_metrics_flusher.rs
cross-reference in the module doc, and the CODEOWNERS entry).

No functional change.

The renamed identifier pushed one debug! line past rustfmt's column
limit. Apply `cargo fmt -p datadog-sidecar -p datadog-sidecar-ffi` to
break the macro across three lines, matching CI's nightly-2026-02-08
rustfmt.

Single architecture diagram showing the end-to-end FFE delivery path
through the sidecar:

  tracer payload → ddog_sidecar_send_ffe_{exposures,metrics} FFI
                 → tarpc enqueue_actions IPC
                 → sidecar_server.rs enqueue_actions handler
                 → FFE filter (lifted out of applications.entry gate, this PR)
                 → ffe_exposures_flusher / ffe_metrics_flusher
                 → NativeCapabilities HTTP client
                 → Agent EVP proxy / OTLP HTTP intake

Uses `flowchart TD` and a quoted YAML title (Mermaid's frontmatter
parser eats unquoted `#` as comments). PNG rendered at 2400×2400
`--scale 3 -b white` for legible PR-page thumbnails.
@leoromanovsky leoromanovsky marked this pull request as ready for review May 24, 2026 13:05
@leoromanovsky leoromanovsky requested review from a team as code owners May 24, 2026 13:05

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated no new comments.

Comment thread datadog-ffe/src/telemetry/exposures.rs Outdated
Comment on lines +56 to +62
let key = ExposureCacheKey {
service: context.service.clone(),
env: context.env.clone(),
version: context.version.clone(),
flag_key: exposure.flag_key.clone(),
subject_id: exposure.subject_id.clone(),
};

nit: Could compute a hash from the keys and check for membership in a "seen" memo if there is any performance concern about cloning strings. But that has its own tradeoffs (e.g. extra memory for the hash set) so this is fine too
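The alternative suggested in this comment could look roughly like the sketch below, which hashes borrowed fields instead of cloning them into a key struct. `exposure_fingerprint` and `SeenMemo` are hypothetical names; the set here is unbounded and a 64-bit hash collision would suppress a distinct exposure, which are the kinds of tradeoffs the comment alludes to.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hashes the five key fields without allocating or cloning any strings.
pub fn exposure_fingerprint(
    service: &str,
    env: &str,
    version: &str,
    flag_key: &str,
    subject_id: &str,
) -> u64 {
    let mut hasher = DefaultHasher::new();
    // Hashing the tuple keeps field boundaries distinct ("ab","c" != "a","bc").
    (service, env, version, flag_key, subject_id).hash(&mut hasher);
    hasher.finish()
}

/// "Seen" memo keyed by fingerprint only.
#[derive(Default)]
pub struct SeenMemo {
    seen: HashSet<u64>,
}

impl SeenMemo {
    /// Returns true the first time a fingerprint is observed.
    pub fn first_sighting(&mut self, fingerprint: u64) -> bool {
        self.seen.insert(fingerprint)
    }
}
```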

Comment thread datadog-sidecar/src/service/sidecar_server.rs Outdated

@bwoebi bwoebi left a comment


Generally looks good to me, except for a weird comment.

gh-worker-dd-mergequeue-cf854d Bot pushed a commit that referenced this pull request Jun 1, 2026
## Motivation

PHP FFE evaluation metrics need a native path for aggregation, OTLP encoding, and delivery without building PHP OTLP writer/transport machinery. The shared design doc is the cross-PR reference: https://docs.google.com/document/d/1NvMfTpZWLBlFmEFNjdnlMyeVpy5l7KD8qujGFco6w2w/edit?tab=t.0

This PR is metric-only. Exposures remain in #2026 so reviewers can evaluate OTLP metric delivery independently from exposure cache semantics.

## Changes

This adds caller-driven FFE evaluation metric sidecar actions and OTLP export for `feature_flag.evaluations`.

The reusable FFE-domain pieces now live in `datadog-ffe` behind the `evaluation-metrics` feature: evaluation metric input types, metric attribute normalization, aggregation by matching attribute sets, and OTLP/protobuf payload encoding. `datadog-sidecar` keeps only sidecar-specific work: parsing the configured endpoint URL, building the HTTP request, applying the timeout, logging delivery failures, and integrating with sidecar lifecycle/actions.
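To illustrate "aggregation by matching attribute sets", here is a minimal sketch of a counter keyed by the full attribute map. `Attributes` and `EvaluationCounter` are hypothetical names, and the attribute keys used by callers are placeholders; the actual `datadog-ffe` types and normalization rules are not shown in this description.

```rust
use std::collections::BTreeMap;

/// Ordered attribute set, so identical sets compare and sort equal
/// regardless of insertion order.
pub type Attributes = BTreeMap<String, String>;

/// Counts every evaluation; there is no deduplication, only aggregation
/// of evaluations that share exactly the same attributes.
#[derive(Default)]
pub struct EvaluationCounter {
    series: BTreeMap<Attributes, u64>,
}

impl EvaluationCounter {
    pub fn record(&mut self, attrs: Attributes) {
        *self.series.entry(attrs).or_insert(0) += 1;
    }

    /// Takes the aggregated series for encoding and resets the counter.
    pub fn drain(&mut self) -> Vec<(Attributes, u64)> {
        std::mem::take(&mut self.series).into_iter().collect()
    }
}
```

Each drained `(attributes, count)` pair would correspond to one data point of the `feature_flag.evaluations` counter in the encoded payload.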

The PHP companion PR uses this from native/C code for raw `DDTrace\ffe_evaluate` calls and from a thin PHP OpenFeature adapter for final OpenFeature-aware results. PHP does not aggregate, encode, or transport OTLP payloads.

Current PHP MVP path:

```mermaid
flowchart LR
    Eval["PHP evaluation<br/>raw API or OpenFeature adapter"]
    Record["PHP tracer native call<br/>record typed evaluation metric"]
    Action["sidecar action<br/>record FFE evaluation metrics"]
    Domain["datadog-ffe<br/>feature: evaluation-metrics<br/>attributes + aggregation + OTLP encoder"]
    Sidecar["shared sidecar<br/>metric flush lifecycle"]
    Collector["OTLP endpoint<br/>Agent or local collector"]
    Intake["feature_flag.evaluations"]

    Eval --> Record
    Record --> Action
    Action --> Domain
    Domain --> Sidecar
    Sidecar --> Collector
    Collector --> Intake
```

Future Python/Ruby connection:

```mermaid
flowchart LR
    PyToday["dd-trace-py today<br/>OpenFeature hook + host metric writer"]
    RbToday["dd-trace-rb today<br/>OpenFeature hook + host metric writer"]
    PyFuture["dd-trace-py future<br/>explicit native opt-in"]
    RbFuture["dd-trace-rb future<br/>explicit native opt-in"]
    Native["libdatadog caller-driven<br/>FFE metric action"]
    Shared["shared sidecar<br/>aggregation + OTLP delivery"]
    Otlp["OTLP endpoint"]

    PyToday -. "current host metric path" .-> Otlp
    RbToday -. "current host metric path" .-> Otlp
    PyFuture -. "after ownership switch" .-> Native
    RbFuture -. "after ownership switch" .-> Native
    Native --> Shared
    Shared --> Otlp
```

The future Python/Ruby arrows are intentionally not active behavior in this PR. They show the reusable target for a later migration while preserving today's host-language metric writers.

Why Python/Ruby do not double count today:

- Python and Ruby use libdatadog for evaluation only; the evaluator returns assignment metadata and does not record `feature_flag.evaluations` as a side effect.
- This PR adds a separate caller-driven sidecar action. Metric emission happens only when an SDK explicitly records a typed evaluation metric into that action. PHP wires this in its companion PR; Python and Ruby do not.
- Python and Ruby therefore keep exactly their current host-language OpenFeature metric writers. They are not also sending evaluation metrics through this native sidecar path.
- Evaluation metrics intentionally count every evaluation and do not have exposure-cache deduplication semantics. Future Python/Ruby migration must switch ownership to native logging and disable/bypass the host metric writer for the same evaluations.

Reference implementation check: dd-trace-java's canonical metric path is OpenFeature hook based. Java's `Provider` creates `FlagEvalMetrics` and returns a `FlagEvalHook`; the hook runs in `finallyAfter`, reads the final OpenFeature `FlagEvaluationDetails` including flag key, variant, reason, error code, and allocation metadata, and records one `feature_flag.evaluations` counter. Application code only calls OpenFeature; it does not call a metric API.

PHP mirrors that canonical OpenFeature shape. The PHP OpenFeature provider disables raw native metric recording while it asks the native evaluator for an assignment, then records exactly one final OpenFeature-aware metric through the Datadog-owned recorder. The raw Datadog PHP client has no direct Java equivalent, but it keeps the same SDK-owned ergonomics: normal evaluation APIs record one native metric per evaluation internally. For future Python/Ruby migration, the same rule applies: either keep the existing host-language OpenFeature metric hook, or switch ownership to the native recorder and disable/bypass the host metric writer for those evaluations.


## Decisions

No telemetry is emitted automatically from shared libdatadog evaluator calls. SDKs must explicitly enqueue FFE telemetry actions. This avoids double counting for Python/Ruby, which currently log feature-flag telemetry in host-language code.

Evaluation metrics intentionally count evaluations and do not use exposure-cache deduplication semantics.

Future Python/Ruby migration must be an ownership switch, not an additional writer. If those SDKs opt into this native metric path, their host-language OpenFeature metric writers must stop recording the same evaluations.

## Validation

Current head (`96d9a7bae`) local validation:

```sh
cd /Users/leo.romanovsky/go/src/github.com/DataDog/libdatadog-ffe-sidecar-metrics
cargo fmt --check
cargo test -p datadog-ffe --features evaluation-metrics telemetry::evaluation_metrics
cargo test -p datadog-sidecar ffe_metric
cargo check -p datadog-ffe
cargo check -p datadog-sidecar-ffi
```

Results: datadog-ffe metric tests passed (2 passed), sidecar metric tests passed (6 passed), default datadog-ffe check passed, sidecar FFI check passed, fmt check passed with only the repo stable-rustfmt warnings.

Prior downstream PHP behavior validation before the reusable-crate refactor, from DataDog/dd-trace-php#3911 using this PR at `1f1fca439`:

```text
ffe-dogfooding subject=php-3911-split-1779981881
php7_metrics=3 php8_metrics=3
php7_exposures=0 php8_exposures=0
```

System-tests downstream validation:

```sh
TEST_LIBRARY=php ./run.sh FEATURE_FLAGGING_AND_EXPERIMENTATION tests/ffe/test_flag_eval_metrics.py -vv
```

Result: 17 passed in 81.26 seconds.

Related PRs: DataDog/dd-trace-php#3906, DataDog/dd-trace-php#3911, #2026, DataDog/system-tests#7033.



Co-authored-by: leo.romanovsky <leo.romanovsky@datadoghq.com>
…decar-exposures

# Conflicts:
#	.github/CODEOWNERS
#	datadog-ffe/Cargo.toml
#	datadog-ffe/src/lib.rs
#	datadog-ffe/src/telemetry/mod.rs
#	datadog-sidecar-ffi/src/lib.rs
#	datadog-sidecar/Cargo.toml
#	datadog-sidecar/src/service/mod.rs
#	datadog-sidecar/src/service/sidecar_server.rs
#	datadog-sidecar/src/service/telemetry.rs
8 participants