Summary
We've hit two separate incidents where a feature flag toggle in the WorkOS dashboard
did not propagate to the expected set of users. In both cases the dashboard showed
the flag enabled, but the application behavior (driven by the feature_flags JWT
claim issued by AuthKit) reflected the previous targeting rule.
Filing here because the JWT claim is what AuthKit issues and what our app reads —
happy to be redirected to a better channel if this is a server-side concern.
How we read feature flags
We use @workos-inc/authkit-nextjs on the frontend and read flags from the JWT's
feature_flags claim in our Hono backend middleware:
c.set('featureFlags', payload.feature_flags ?? []);
Downstream checks are a simple featureFlags.includes('snippet-synthesis').
We do not call the Feature Flags API at request time — the JWT is the carrier.
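For reference, the read path is roughly equivalent to the sketch below. It is illustrative rather than our exact code: `extractFeatureFlags` and `hasFlag` are hypothetical helper names, and the payload is assumed to have already been signature-verified before it reaches them.

```typescript
// Sketch only: assumes `payload` is an already-verified JWT payload.
// `extractFeatureFlags` and `hasFlag` are hypothetical helper names.
type JwtPayload = Record<string, unknown>;

// Normalize the `feature_flags` claim to a string array.
// A missing or malformed claim is treated as "no flags".
function extractFeatureFlags(payload: JwtPayload): string[] {
  const claim = payload["feature_flags"];
  if (!Array.isArray(claim)) return [];
  return claim.filter((f): f is string => typeof f === "string");
}

// Same check our downstream code performs with `includes`.
function hasFlag(payload: JwtPayload, flag: string): boolean {
  return extractFeatureFlags(payload).includes(flag);
}
```

Because a missing claim collapses to an empty array, an absent `feature_flags` claim and an explicitly empty one are indistinguishable in our app, which may matter when interpreting the incidents below.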
Incident 2 (most recent)
- Flag: snippet-synthesis
- Flag was targeted at our organization only. Behavior was correct: our org saw
the feature, other orgs did not.
- We changed targeting from "our organization only" to "all organizations" in the
dashboard.
- After the change, the feature stopped working for our organization while
other organizations started receiving it as expected.
Expected: broadening targeting from "Org X" to "all organizations" is a superset
— Org X should never lose the flag.
Actual: Org X lost the flag while everyone else gained it.
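The superset expectation can be written as a small check over flag observations taken before and after the dashboard change. This is a sketch under assumptions: `findLostOrgs` and the organization IDs are hypothetical, and each snapshot is assumed to record whether tokens issued for that organization carried the flag.

```typescript
// Sketch only: snapshots map an organization ID to whether its tokens
// carried the flag. `findLostOrgs` is a hypothetical helper name.
type FlagSnapshot = Record<string, boolean>;

// Returns the organizations that had the flag before the change but not after.
// For a pure widening of targeting, this list should be empty.
function findLostOrgs(before: FlagSnapshot, after: FlagSnapshot): string[] {
  return Object.keys(before).filter(
    (orgId) => before[orgId] === true && after[orgId] !== true
  );
}

// Shape of Incident 2, with placeholder IDs.
const lost = findLostOrgs(
  { org_x: true, org_y: false },
  { org_x: false, org_y: true }
);
```

In Incident 2 the equivalent of `lost` contained our own organization, which is the behavior we cannot explain.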
Incident 1 (earlier — different flag)
- Flag was enabled in production for a single user → feature visible ✅
- Same flag enabled in staging for everyone → feature not visible ❌
Same shape of problem: dashboard state and the runtime feature_flags claim
disagreed.
What we'd like help understanding
- Are there known propagation or caching delays between a dashboard toggle and
the values that appear in the feature_flags JWT claim on subsequent token
issuance / refresh?
- When a flag's targeting rule is widened (e.g. "specific org" → "all orgs"),
is it possible for orgs covered by the original rule to transiently drop out
of the new rule's evaluation set?
- Is the feature_flags JWT claim evaluated:
  - at token issuance (so existing sessions retain stale values until
    refresh / sign-in), or
  - at token refresh against the current rule set?
Either is defensible; we'd just like to confirm so we can document expected
user impact when we toggle flags.
- Is there an API or dashboard view that shows the effective evaluation for
a given (user, organization) pair so we can reproduce discrepancies
ourselves?
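To separate "stale token" from "wrong evaluation" on our side, we can compare each failing token's issue time with the time of the dashboard change. A minimal sketch, assuming the access token carries the standard `iat` claim (seconds since the epoch, per RFC 7519); `classifyToken` is a hypothetical helper name.

```typescript
// Sketch only: assumes the token payload includes the standard `iat` claim.
// `classifyToken` is a hypothetical helper name.
type TokenTiming = "issued-before-change" | "issued-after-change" | "unknown";

// Classify a token relative to the dashboard change time (milliseconds since epoch).
function classifyToken(
  payload: Record<string, unknown>,
  changeAtMs: number
): TokenTiming {
  const iat = payload["iat"];
  if (typeof iat !== "number") return "unknown";
  // `iat` is in seconds; convert to milliseconds before comparing.
  return iat * 1000 < changeAtMs ? "issued-before-change" : "issued-after-change";
}
```

A token classified as `issued-after-change` that still lacks the flag would point at evaluation rather than session staleness, which is the case we most want to confirm or rule out.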
What we can share privately
Happy to share via support / a private channel:
- WorkOS environment ID
- WorkOS organization ID
- A specific WorkOS user ID that observed the failure
- Approximate UTC timestamps of the dashboard change and the failed requests
- Request IDs / x-workos-request-id headers from token issuance around that time
Let us know what would be most useful and the right place to send it.