Skip to content

perf: Skip validation for registered query cache hits#1134

Open
jwils wants to merge 1 commit intomainfrom
joshuaw/skip-validation-registered-queries
Open

perf: Skip validation for registered query cache hits#1134
jwils wants to merge 1 commit intomainfrom
joshuaw/skip-validation-registered-queries

Conversation

@jwils
Copy link
Copy Markdown
Collaborator

@jwils jwils commented Apr 16, 2026

Summary

Skip graphql-ruby's StaticValidation pipeline for registered queries that hit the cache in ForRegisteredClient#prepare_query_for_execution. These queries were already validated at CI time via the QueryValidator Rake task, so re-validating at runtime on every request is redundant.

The prepare_query_for_execution method already passes the cached document: to skip re-parsing (~10ms on large queries). This extends that optimization by also passing validate: false, avoiding the validation pipeline cost on every cache-hit request.

Safety

  • Only applies to the cache-hit path — queries that exactly match a registered form (or a previously-matched canonical form)
  • Unregistered queries and first-time queries still go through full validation via the yield path in build_and_validate_query
  • allow_any_query_for_clients queries that don't hit the cache also go through full validation
  • Schema and registered queries are deployed as a single artifact, validated together in CI
  • validate: false only disables static (query-string) validation. Variable validation still runs unconditionally in GraphQL::Query::ValidationPipeline via @query.variables.errors (which coerces and validates each provided variable against its declared type), so invalid variable values are still reported on every request. There is a regression test covering this.

Benchmark

benchmarks/query_registry/skip_validation.rb measures the cost of GraphQL::Query.new(document:) + query.valid? with validate: true (prior behavior) vs validate: false (new cache-hit behavior). The skip cost is a flat ~13 μs; the validate: true cost scales with query complexity:

Size Query bytes validate: true validate: false Speedup Savings/request
small ~280 B 178 μs 13 μs 13.6× ~165 μs
medium ~4 KB 1.83 ms 13 μs 137× ~1.8 ms
large ~60 KB 26 ms 14 μs 1,858× ~26 ms
xlarge ~416 KB 163 ms 14 μs 11,922× ~163 ms

Full results: benchmarks/query_registry/skip_validation.results.txt.

Test plan

  • elasticgraph-query_registry specs pass (112 examples, 0 failures) with 100% line and branch coverage
  • New unit tests directly assert the validation-skip behavior:
    • query.validate is false for byte-for-byte cache hits and true for the yield path (differing / first-time alternate form)
    • GraphQL::StaticValidation::BaseVisitor.including_rules is not called for cache-hit registered queries but is called for unregistered (yielded) queries — without the validate: false change these assertions flip
    • Invalid variable values on a cache-hit query still produce static errors (safety test confirming variable validation is unaffected)
  • script/lint clean
  • script/type_check clean

@jwils jwils force-pushed the joshuaw/skip-validation-registered-queries branch from 35c0024 to 9a11cb3 Compare April 16, 2026 16:37
Copy link
Copy Markdown
Collaborator

@myronmarston myronmarston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice find! I do, however, have a potential concern: do we know if validate: false only turns off the validation of the query string, or does it also disable the validation of variables? If it only disables the former this seems safe, but if it also disables variable validation (e.g. verifying that values have been provided for non-null variables w/o defaults, that the variable values align with the variable types, etc), then this doesn't seem safe. Would be good to confirm that with validate: false, we still get the same validation errors as before when invalid variable values are provided.

In addition, there are two things that would be nice to do before we merge:

  • Can we quantify the benefit this provides? e.g. can you use benchmark-ips and do a before/after comparison on different sizes of queries and report back about it? (That's how I got the prior 10ms estimate on parsing a large query).
  • It would be nice to include some kind of test coverage for this. I'm not sure the best way to do that--is there some kind of class method which we can mock and assert that it's called when given an unregistered query but not called for a registered query? (And verify that without your code change, the test fails indicating that it would have been called without passing validate: false)?

Registered queries are validated at CI time via the QueryValidator Rake
task. Re-validating them at runtime on every request is redundant work.

The `prepare_query_for_execution` method already passes the cached
`document:` to skip re-parsing. This extends that optimization by also
passing `validate: false` to skip graphql-ruby's StaticValidation
pipeline for queries that matched a registered form.

This is safe because:
- Registered queries are validated against the schema in CI before deploy
- Schema and queries are deployed together as a single artifact
- The `validate: false` path is only reached for cache hits (exact string
  match or previously-matched canonical form)
- Unregistered queries and first-time queries still go through full
  validation via the `yield` path
- `validate: false` only disables static (query-string) validation;
  variable validation still runs via `GraphQL::Query#variables`, so
  invalid variable values are still reported on every request

Benchmarks (benchmarks/query_registry/skip_validation.rb) show that
skipping static validation is essentially free (~13μs flat) while the
`validate: true` path scales with query complexity: 178μs for a small
query (13x), 1.8ms for a 4 KB query (137x), 26ms for a 60 KB query
(1858x), and 163ms for a 400 KB query (11922x).
@jwils jwils force-pushed the joshuaw/skip-validation-registered-queries branch from 9a11cb3 to e274b42 Compare April 16, 2026 22:23
operation_name: operation_name,
context: context
context: context,
# Registered queries are validated at CI time (via the QueryValidator Rake task),
Copy link
Copy Markdown
Collaborator

@myronmarston myronmarston Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurs to me that validation will only happen at CI time if the EG project has set that up. It's possible to use elasticgraph-query_registry without setting that up and, until this change, there wasn't a risk that queries would be unvalidated.

I think we should update the elasticgraph-query_registry README to mention that using the registry means that registered queries will not be validated at runtime and that the validation task must be used on CI. Also, it's probably worth mentioning (in the README) the additional benefits elasticgraph-query_registry provides where query performance is better due to eliminating the parsing and validition.

end

context "when skipping redundant static validation for cache-hit registered queries" do
it "returns a query with `validate: false` for a byte-for-byte cache hit, but `validate: true` for the yield path" do
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "yield path" refer to?

Edit: I also see that term used below in a few places; please update them as well.

)
end

context "when skipping redundant static validation for cache-hit registered queries" do
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
context "when skipping redundant static validation for cache-hit registered queries" do
describe "query validation" do

when skipping redundant static validation for cache-hit registered queries isn't accurate for all the examples in here--you've got some examples which do not skip the static validation.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this matters more than I would have thought! Thanks for providing the benchmark results.

# First submission of the alternate form goes through the yield path; `validate` is the default (true).
first_query, _, status = registry.build_and_validate_query(modified_query_string, client: client_named("my_client"))
expect(status).to eq(RegistrationStatus::MATCHED_REGISTERED_QUERY)
expect(first_query.validate).to eq(true)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading this right, it means we still pay the cost of validation the first time we encounter a query that differs slightly but is cannonically the same. Can we solve that? e.g. would this do the trick?

diff --git a/elasticgraph-query_registry/lib/elastic_graph/query_registry/query_validators/for_registered_client.rb b/elasticgraph-query_registry/lib/elastic_graph/query_registry/query_validators/for_registered_client.rb
index 644174e4..62b13f3e 100644
--- a/elasticgraph-query_registry/lib/elastic_graph/query_registry/query_validators/for_registered_client.rb
+++ b/elasticgraph-query_registry/lib/elastic_graph/query_registry/query_validators/for_registered_client.rb
@@ -90,7 +90,7 @@ module ElasticGraph
             (_ = cached_client_data).with_updated_last_query(query_string, cachable_query)
           end

-          [query, [], registration_status]
+          [prepare_query_for_execution(query, variables: variables, operation_name: operation_name, context: context), [], registration_status]
         end

         private

second_query, _, status = registry.build_and_validate_query(modified_query_string, client: client_named("my_client"))
expect(status).to eq(RegistrationStatus::MATCHED_REGISTERED_QUERY)
expect(second_query.validate).to eq(false)
end
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The elasticgraph-query_registry tests are relatively slow. As such, we should avoid duplicating test coverage. Can we consolidate some number of these tests? On the surface, it seems like there's some duplication here...

allow(::GraphQL::StaticValidation::BaseVisitor).to receive(:including_rules).and_call_original
expect(query.valid?).to be true
expect(::GraphQL::StaticValidation::BaseVisitor).to have_received(:including_rules)
end
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are tightly coupled to current GraphQL gem internals (e.g. since they spy on ::GraphQL::StaticValidation::BaseVisitor which could be refactored away in a future GraphQL gem release). It's not clear to me that we need these tests given the tests above which check query.validate. But if you want these tests to run the validation for real (which has merit in general), an alternate idea: use a query which fails validation. Demonstrate that query.valid? is false when the query is unregistered and true when its registered even though it's not actually valid.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: I'm remembering now that I recommended this approach in my prior review ("is there some kind of class method which we can mock and assert that it's called...?"). Sorry about changing my mind here but seeing the test helped me realize the potential better approach!

expect(::GraphQL::StaticValidation::BaseVisitor).to have_received(:including_rules)
end

it "still validates variable values even when static validation is skipped" do
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks for adding a test for this! This one is worth keeping for sure, IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants