## Motivation
PHP FFE exposure delivery needs a native path with a cache that persists beyond a single PHP request/thread. The shared design doc is the cross-PR reference: https://docs.google.com/document/d/1NvMfTpZWLBlFmEFNjdnlMyeVpy5l7KD8qujGFco6w2w/edit?tab=t.0
This PR is exposure-only. Metrics were split into #2052 so reviewers can evaluate exposure cache and delivery separately from OTLP evaluation metrics.
## Changes
This adds caller-driven FFE exposure sidecar actions, exposure payload forwarding through the Agent EVP proxy, and a shared exposure cache that deduplicates repeated `(service, env, version, flag, subject)` assignments across PHP requests and sidecar connections.
The reusable FFE-domain pieces now live in `datadog-ffe` behind the `exposure-events` feature: exposure input types, the LRU deduplication cache, and JSON payload encoding. `datadog-sidecar` keeps only sidecar-specific work: deriving the agent EVP endpoint, building the HTTP request, applying the timeout, logging delivery failures, and integrating with sidecar lifecycle/actions.
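As an illustration of the deduplication described above, here is a minimal sketch of a bounded cache keyed on `(service, env, version, flag, subject)`. The names `ExposureKey`, `ExposureDedupCache`, and `should_emit` are hypothetical and are not the types shipped in `datadog-ffe`; the linear-scan eviction is a simplification chosen for brevity, not a claim about the real LRU implementation.

```rust
use std::collections::HashMap;

/// Hypothetical dedup key; field names are illustrative only.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ExposureKey {
    pub service: String,
    pub env: String,
    pub version: String,
    pub flag: String,
    pub subject: String,
}

impl ExposureKey {
    pub fn new(service: &str, env: &str, version: &str, flag: &str, subject: &str) -> Self {
        Self {
            service: service.to_string(),
            env: env.to_string(),
            version: version.to_string(),
            flag: flag.to_string(),
            subject: subject.to_string(),
        }
    }
}

/// Bounded cache that remembers the most recently seen keys.
pub struct ExposureDedupCache {
    capacity: usize,
    tick: u64,
    last_seen: HashMap<ExposureKey, u64>,
}

impl ExposureDedupCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity: capacity.max(1), tick: 0, last_seen: HashMap::new() }
    }

    /// Returns true when the key is not cached, meaning the exposure should be delivered.
    pub fn should_emit(&mut self, key: ExposureKey) -> bool {
        self.tick += 1;
        if let Some(seen) = self.last_seen.get_mut(&key) {
            *seen = self.tick; // refresh recency for a repeated assignment
            return false;
        }
        if self.last_seen.len() >= self.capacity {
            // Evict the least recently used key (linear scan; acceptable for a sketch).
            let oldest = self
                .last_seen
                .iter()
                .min_by_key(|(_, t)| **t)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.last_seen.remove(&k);
            }
        }
        self.last_seen.insert(key, self.tick);
        true
    }
}
```

Because the cache is bounded, a key that has been evicted is treated as new and would be delivered again.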
Current PHP MVP path:
```mermaid
flowchart LR
Eval["PHP native evaluation<br/>ddog_ffe_evaluate"]
Batch["PHP tracer native memory<br/>request/thread-local exposure batch"]
Shutdown["PHP RSHUTDOWN<br/>flush exposure batch"]
Action["sidecar action<br/>record FFE exposures"]
Domain["datadog-ffe<br/>feature: exposure-events<br/>types + cache + JSON encoder"]
Sidecar["shared sidecar<br/>cross-request and cross-thread exposure cache"]
Agent["Datadog Agent<br/>EVP proxy"]
Intake["FFE exposure intake"]
Eval -->|"doLog=true assignment"| Batch
Batch --> Shutdown
Shutdown --> Action
Action --> Domain
Domain --> Sidecar
Sidecar --> Agent
Agent --> Intake
```
Future Python/Ruby connection:
```mermaid
flowchart LR
PyToday["dd-trace-py today<br/>host-language exposure writer"]
RbToday["dd-trace-rb today<br/>host-language exposure writer"]
PyFuture["dd-trace-py future<br/>explicit native opt-in"]
RbFuture["dd-trace-rb future<br/>explicit native opt-in"]
Native["libdatadog caller-driven<br/>FFE exposure action"]
Shared["shared sidecar<br/>dedupe + EVP delivery"]
Agent["Datadog Agent<br/>EVP proxy"]
PyToday -. "current direct EVP path" .-> Agent
RbToday -. "current direct EVP path" .-> Agent
PyFuture -. "after ownership switch" .-> Native
RbFuture -. "after ownership switch" .-> Native
Native --> Shared
Shared --> Agent
```
The future Python/Ruby arrows are intentionally not active behavior in this PR. They show why the reusable code lives in `datadog-ffe` rather than directly in sidecar internals, while preserving today's host-language ownership.
Why Python/Ruby do not double count today:
- Python and Ruby use libdatadog for evaluation only; the evaluator returns assignment metadata and does not enqueue exposure telemetry as a side effect.
- This PR adds a separate caller-driven sidecar action. Exposure emission happens only when an SDK explicitly records exposure candidates into that action. PHP wires this in its companion PR; Python and Ruby do not.
- Python and Ruby therefore keep exactly their current host-language EVP exposure writers. They are not also sending exposure candidates through this native sidecar path.
- The sidecar cache only deduplicates exposure candidates that enter the native sidecar path. It cannot protect direct host-language EVP writers, so future Python/Ruby migration must switch ownership to native logging and disable/bypass the host exposure writer for the same evaluations.
Reference implementation check: dd-trace-java follows the same exposure semantics and user ergonomics. Java's `DDEvaluator` is SDK-owned evaluation code; after resolving an assignment, it checks allocation `doLog`, builds an exposure event with flag, variant, allocation, targeting key, and context, and dispatches it through `FeatureFlaggingGateway`. `ExposureWriterImpl` subscribes to those exposure events, queues them, deduplicates with an LRU exposure cache, serializes service/env/version context, and posts to the Agent EVP proxy. Application code only calls the OpenFeature provider; it does not call an exposure API.
PHP mirrors that canonical shape, with PHP-specific lifecycle mechanics: the dd-trace-php evaluation bridge records `doLog=true` exposure candidates internally, request shutdown flushes the batch, and this PR's sidecar path owns cross-request deduplication and EVP delivery. For future Python/Ruby migration, the same rule applies: wire native exposure recording inside the SDK-owned evaluation path, and turn off the existing host-language exposure writer for those evaluations.
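The request lifecycle described above (record `doLog=true` candidates during evaluation, flush once at request shutdown) can be sketched as follows. This is a hedged illustration only: `ExposureCandidate`, `RequestExposureBatch`, and the `send` callback are hypothetical names standing in for the dd-trace-php bridge and the sidecar action, and do not reflect their real signatures.

```rust
/// Hypothetical exposure candidate produced by one evaluation.
#[derive(Clone, Debug, PartialEq)]
pub struct ExposureCandidate {
    pub flag: String,
    pub subject: String,
    pub variant: String,
    pub allocation: String,
    pub do_log: bool,
}

/// Request-local batch; in the sketch it lives for exactly one request.
#[derive(Default)]
pub struct RequestExposureBatch {
    pending: Vec<ExposureCandidate>,
}

impl RequestExposureBatch {
    /// Keep only candidates whose allocation asked for logging.
    pub fn record(&mut self, candidate: ExposureCandidate) {
        if candidate.do_log {
            self.pending.push(candidate);
        }
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    /// Hand the whole batch to `send` (standing in for the sidecar action)
    /// and leave the batch empty. Does nothing when there is nothing to send.
    pub fn flush<F: FnMut(Vec<ExposureCandidate>)>(&mut self, mut send: F) {
        if self.pending.is_empty() {
            return;
        }
        send(std::mem::take(&mut self.pending));
    }
}
```

In this sketch deduplication is deliberately absent from the batch: it only collects, and the shared cache on the sidecar side decides what is actually delivered.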
## Decisions
No telemetry is emitted automatically from shared libdatadog evaluator calls. SDKs must explicitly enqueue FFE telemetry actions. This remains required for Python/Ruby coexistence because those SDKs currently log exposures and metrics in host-language code.
The sidecar cache deduplicates only exposure candidates sent through this native sidecar path; it cannot deduplicate direct host-language EVP writers.
Future Python/Ruby migration must be an ownership switch, not an additional writer. When those SDKs opt into this native exposure path, their host-language exposure writers must be disabled or bypassed for the same evaluations to avoid double counting.
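One way to picture "ownership switch, not an additional writer" is a single selector that routes each exposure to exactly one sink. The sketch below is an assumption-laden illustration: `ExposureOwner` and `Sinks` are hypothetical and do not correspond to any existing dd-trace-py, dd-trace-rb, or libdatadog API.

```rust
/// Hypothetical selector: exactly one component owns exposure emission.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ExposureOwner {
    HostWriter,
    NativeSidecar,
}

/// Stand-ins for the host-language EVP writer and the native sidecar action.
#[derive(Default)]
pub struct Sinks {
    pub host: Vec<String>,
    pub native: Vec<String>,
}

impl Sinks {
    /// Route one exposure to the owning sink only, never to both.
    pub fn record(&mut self, owner: ExposureOwner, flag: &str) {
        match owner {
            ExposureOwner::HostWriter => self.host.push(flag.to_string()),
            ExposureOwner::NativeSidecar => self.native.push(flag.to_string()),
        }
    }
}
```

Modeling the choice as an enum rather than two independent booleans makes the "both writers enabled" state unrepresentable.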
## Validation
Current head (`8be471fbc`) local validation:
```sh
cd /Users/leo.romanovsky/go/src/github.com/DataDog/libdatadog-ffe-sidecar-exposures
cargo fmt --check
cargo test -p datadog-ffe --features exposure-events telemetry::exposures
cargo test -p datadog-sidecar ffe_exposure
cargo check -p datadog-ffe
cargo check -p datadog-sidecar-ffi
```
Results: datadog-ffe exposure tests passed (4 passed), sidecar exposure tests passed (6 passed), default datadog-ffe check passed, sidecar FFI check passed, fmt check passed with only the repo stable-rustfmt warnings.
Prior downstream PHP behavior validation before the reusable-crate refactor, from DataDog/dd-trace-php#3910 using this PR at `6d23848a`:
```text
ffe-dogfooding subject=php-3910-split-1779981442
php7_exposures=1 php8_exposures=1
php7_metrics=0 php8_metrics=0
```
System-tests downstream validation:
```sh
TEST_LIBRARY=php ./run.sh FEATURE_FLAGGING_AND_EXPERIMENTATION tests/ffe/test_exposures.py -vv
```
Result: 11 passed in 77.53 seconds.
Related PRs: DataDog/dd-trace-php#3906, DataDog/dd-trace-php#3910, #2052, DataDog/system-tests#7031.
Co-authored-by: leo.romanovsky <leo.romanovsky@datadoghq.com>
## Motivation
PHP FFE evaluation metrics need a native path for aggregation, OTLP encoding, and delivery without building PHP OTLP writer/transport machinery. The shared design doc is the cross-PR reference: https://docs.google.com/document/d/1NvMfTpZWLBlFmEFNjdnlMyeVpy5l7KD8qujGFco6w2w/edit?tab=t.0
This PR is metric-only. Exposures remain in #2026 so reviewers can evaluate OTLP metric delivery independently from exposure cache semantics.
## Changes
This adds caller-driven FFE evaluation metric sidecar actions and OTLP export for `feature_flag.evaluations`.
The reusable FFE-domain pieces now live in `datadog-ffe` behind the `evaluation-metrics` feature: evaluation metric input types, metric attribute normalization, aggregation by matching attribute sets, and OTLP/protobuf payload encoding. `datadog-sidecar` keeps only sidecar-specific work: parsing the configured endpoint URL, building the HTTP request, applying the timeout, logging delivery failures, and integrating with sidecar lifecycle/actions.
The PHP companion PR uses this from native/C code for raw `DDTrace\ffe_evaluate` calls and from a thin PHP OpenFeature adapter for final OpenFeature-aware results. PHP does not aggregate, encode, or transport OTLP payloads.
Current PHP MVP path:
```mermaid
flowchart LR
Eval["PHP evaluation<br/>raw API or OpenFeature adapter"]
Record["PHP tracer native call<br/>record typed evaluation metric"]
Action["sidecar action<br/>record FFE evaluation metrics"]
Domain["datadog-ffe<br/>feature: evaluation-metrics<br/>attributes + aggregation + OTLP encoder"]
Sidecar["shared sidecar<br/>metric flush lifecycle"]
Collector["OTLP endpoint<br/>Agent or local collector"]
Intake["feature_flag.evaluations"]
Eval --> Record
Record --> Action
Action --> Domain
Domain --> Sidecar
Sidecar --> Collector
Collector --> Intake
```
Future Python/Ruby connection:
```mermaid
flowchart LR
PyToday["dd-trace-py today<br/>OpenFeature hook + host metric writer"]
RbToday["dd-trace-rb today<br/>OpenFeature hook + host metric writer"]
PyFuture["dd-trace-py future<br/>explicit native opt-in"]
RbFuture["dd-trace-rb future<br/>explicit native opt-in"]
Native["libdatadog caller-driven<br/>FFE metric action"]
Shared["shared sidecar<br/>aggregation + OTLP delivery"]
Otlp["OTLP endpoint"]
PyToday -. "current host metric path" .-> Otlp
RbToday -. "current host metric path" .-> Otlp
PyFuture -. "after ownership switch" .-> Native
RbFuture -. "after ownership switch" .-> Native
Native --> Shared
Shared --> Otlp
```
The future Python/Ruby arrows are intentionally not active behavior in this PR. They show the reusable target for a later migration while preserving today's host-language metric writers.
Why Python/Ruby do not double count today:
- The shared libdatadog evaluator does not emit `feature_flag.evaluations` as a side effect.
Reference implementation check: dd-trace-java's canonical metric path is OpenFeature hook based. Java's `Provider` creates `FlagEvalMetrics` and returns a `FlagEvalHook`; the hook runs in `finallyAfter`, reads the final OpenFeature `FlagEvaluationDetails` including flag key, variant, reason, error code, and allocation metadata, and records one `feature_flag.evaluations` counter. Application code only calls OpenFeature; it does not call a metric API.
PHP mirrors that canonical OpenFeature shape. The PHP OpenFeature provider disables raw native metric recording while it asks the native evaluator for an assignment, then records exactly one final OpenFeature-aware metric through the Datadog-owned recorder. The raw Datadog PHP client has no direct Java equivalent, but it keeps the same SDK-owned ergonomics: normal evaluation APIs record one native metric per evaluation internally. For future Python/Ruby migration, the same rule applies: either keep the existing host-language OpenFeature metric hook, or switch ownership to the native recorder and disable/bypass the host metric writer for those evaluations.
## Decisions
No telemetry is emitted automatically from shared libdatadog evaluator calls. SDKs must explicitly enqueue FFE telemetry actions. This avoids double counting for Python/Ruby, which currently log feature-flag telemetry in host-language code.
Evaluation metrics intentionally count evaluations and do not use exposure-cache deduplication semantics.
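To make the contrast with exposure deduplication concrete, here is a minimal sketch of counting evaluations by matching attribute sets. `EvaluationAggregator`, `record`, and `drain` are hypothetical names, and the attribute keys used in the usage note are illustrative; none of this is the `datadog-ffe` API or its OTLP encoding.

```rust
use std::collections::BTreeMap;

/// Attribute pairs sorted by key so that ordering does not affect matching.
type AttributeSet = Vec<(String, String)>;

/// Hypothetical aggregator: one counter per distinct attribute set.
#[derive(Default)]
pub struct EvaluationAggregator {
    counts: BTreeMap<AttributeSet, u64>,
}

impl EvaluationAggregator {
    /// Every evaluation increments a counter; repeats are counted, not dropped.
    pub fn record(&mut self, attributes: &[(&str, &str)]) {
        let mut key: AttributeSet = attributes
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        key.sort();
        *self.counts.entry(key).or_insert(0) += 1;
    }

    /// Take the accumulated data points, leaving the aggregator empty for the next flush.
    pub fn drain(&mut self) -> Vec<(AttributeSet, u64)> {
        std::mem::take(&mut self.counts).into_iter().collect()
    }
}
```

Under these assumptions, recording the same `("feature_flag.key", "checkout")` and variant pair twice yields a single data point with a count of 2, whereas the exposure cache would have delivered only the first occurrence.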
Future Python/Ruby migration must be an ownership switch, not an additional writer. If those SDKs opt into this native metric path, their host-language OpenFeature metric writers must stop recording the same evaluations.
## Validation
Current head (`96d9a7bae`) local validation:
Results: datadog-ffe metric tests passed (2 passed), sidecar metric tests passed (6 passed), default datadog-ffe check passed, sidecar FFI check passed, fmt check passed with only the repo stable-rustfmt warnings.
Prior downstream PHP behavior validation before the reusable-crate refactor, from DataDog/dd-trace-php#3911 using this PR at `1f1fca439`:
System-tests downstream validation:
Result: 17 passed in 81.26 seconds.
Related PRs: DataDog/dd-trace-php#3906, DataDog/dd-trace-php#3911, #2026, DataDog/system-tests#7033.