perf: Skip validation for registered query cache hits#1134
perf: Skip validation for registered query cache hits#1134
Conversation
35c0024 to
9a11cb3
Compare
myronmarston
left a comment
There was a problem hiding this comment.
👍 Nice find! I do, however, have a potential concern: do we know if validate: false only turns off the validation of the query string, or does it also disable the validation of variables? If it only disables the former this seems safe, but if it also disables variable validation (e.g. verifying that values have been provided for non-null variables w/o defaults, that the variable values align with the variable types, etc), then this doesn't seem safe. Would be good to confirm that with validate: false, we still get the same validation errors as before when invalid variable values are provided.
In addition, there are two things that would be nice to do before we merge:
- Can we quantify the benefit this provides? e.g. can you use
benchmark-ipsand do a before/after comparison on different sizes of queries and report back about it? (That's how I got the prior 10ms estimate on parsing a large query). - It would be nice to include some kind of test coverage for this. I'm not sure the best way to do that--is there some kind of class method which we can mock and assert that it's called when given an unregistered query but not called for a registered query? (And verify that without your code change, the test fails indicating that it would have been called without passing
validate: false)?
Registered queries are validated at CI time via the QueryValidator Rake task. Re-validating them at runtime on every request is redundant work. The `prepare_query_for_execution` method already passes the cached `document:` to skip re-parsing. This extends that optimization by also passing `validate: false` to skip graphql-ruby's StaticValidation pipeline for queries that matched a registered form. This is safe because: - Registered queries are validated against the schema in CI before deploy - Schema and queries are deployed together as a single artifact - The `validate: false` path is only reached for cache hits (exact string match or previously-matched canonical form) - Unregistered queries and first-time queries still go through full validation via the `yield` path - `validate: false` only disables static (query-string) validation; variable validation still runs via `GraphQL::Query#variables`, so invalid variable values are still reported on every request Benchmarks (benchmarks/query_registry/skip_validation.rb) show that skipping static validation is essentially free (~13μs flat) while the `validate: true` path scales with query complexity: 178μs for a small query (13x), 1.8ms for a 4 KB query (137x), 26ms for a 60 KB query (1858x), and 163ms for a 400 KB query (11922x).
9a11cb3 to
e274b42
Compare
| operation_name: operation_name, | ||
| context: context | ||
| context: context, | ||
| # Registered queries are validated at CI time (via the QueryValidator Rake task), |
There was a problem hiding this comment.
It occurs to me that validation will only happen at CI time if the EG project has set that up. It's possible to use elasticgraph-query_registry without setting that up and, until this change, there wasn't a risk that queries would be unvalidated.
I think we should update the elasticgraph-query_registry README to mention that using the registry means that registered queries will not be validated at runtime and that the validation task must be used on CI. Also, it's probably worth mentioning (in the README) the additional benefits elasticgraph-query_registry provides where query performance is better due to eliminating the parsing and validition.
| end | ||
|
|
||
| context "when skipping redundant static validation for cache-hit registered queries" do | ||
| it "returns a query with `validate: false` for a byte-for-byte cache hit, but `validate: true` for the yield path" do |
There was a problem hiding this comment.
What does "yield path" refer to?
Edit: I also see that term used below in a few places; please update them as well.
| ) | ||
| end | ||
|
|
||
| context "when skipping redundant static validation for cache-hit registered queries" do |
There was a problem hiding this comment.
| context "when skipping redundant static validation for cache-hit registered queries" do | |
| describe "query validation" do |
when skipping redundant static validation for cache-hit registered queries isn't accurate for all the examples in here--you've got some examples which do not skip the static validation.
There was a problem hiding this comment.
Wow, this matters more than I would have thought! Thanks for providing the benchmark results.
| # First submission of the alternate form goes through the yield path; `validate` is the default (true). | ||
| first_query, _, status = registry.build_and_validate_query(modified_query_string, client: client_named("my_client")) | ||
| expect(status).to eq(RegistrationStatus::MATCHED_REGISTERED_QUERY) | ||
| expect(first_query.validate).to eq(true) |
There was a problem hiding this comment.
If I'm reading this right, it means we still pay the cost of validation the first time we encounter a query that differs slightly but is cannonically the same. Can we solve that? e.g. would this do the trick?
diff --git a/elasticgraph-query_registry/lib/elastic_graph/query_registry/query_validators/for_registered_client.rb b/elasticgraph-query_registry/lib/elastic_graph/query_registry/query_validators/for_registered_client.rb
index 644174e4..62b13f3e 100644
--- a/elasticgraph-query_registry/lib/elastic_graph/query_registry/query_validators/for_registered_client.rb
+++ b/elasticgraph-query_registry/lib/elastic_graph/query_registry/query_validators/for_registered_client.rb
@@ -90,7 +90,7 @@ module ElasticGraph
(_ = cached_client_data).with_updated_last_query(query_string, cachable_query)
end
- [query, [], registration_status]
+ [prepare_query_for_execution(query, variables: variables, operation_name: operation_name, context: context), [], registration_status]
end
private| second_query, _, status = registry.build_and_validate_query(modified_query_string, client: client_named("my_client")) | ||
| expect(status).to eq(RegistrationStatus::MATCHED_REGISTERED_QUERY) | ||
| expect(second_query.validate).to eq(false) | ||
| end |
There was a problem hiding this comment.
The elasticgraph-query_registry tests are relatively slow. As such, we should avoid duplicating test coverage. Can we consolidate some number of these tests? On the surface, it seems like there's some duplication here...
| allow(::GraphQL::StaticValidation::BaseVisitor).to receive(:including_rules).and_call_original | ||
| expect(query.valid?).to be true | ||
| expect(::GraphQL::StaticValidation::BaseVisitor).to have_received(:including_rules) | ||
| end |
There was a problem hiding this comment.
These tests are tightly coupled to current GraphQL gem internals (e.g. since they spy on ::GraphQL::StaticValidation::BaseVisitor which could be refactored away in a future GraphQL gem release). It's not clear to me that we need these tests given the tests above which check query.validate. But if you want these tests to run the validation for real (which has merit in general), an alternate idea: use a query which fails validation. Demonstrate that query.valid? is false when the query is unregistered and true when its registered even though it's not actually valid.
There was a problem hiding this comment.
Also: I'm remembering now that I recommended this approach in my prior review ("is there some kind of class method which we can mock and assert that it's called...?"). Sorry about changing my mind here but seeing the test helped me realize the potential better approach!
| expect(::GraphQL::StaticValidation::BaseVisitor).to have_received(:including_rules) | ||
| end | ||
|
|
||
| it "still validates variable values even when static validation is skipped" do |
There was a problem hiding this comment.
👍 thanks for adding a test for this! This one is worth keeping for sure, IMO.
Summary
Skip graphql-ruby's
StaticValidationpipeline for registered queries that hit the cache inForRegisteredClient#prepare_query_for_execution. These queries were already validated at CI time via theQueryValidatorRake task, so re-validating at runtime on every request is redundant.The
prepare_query_for_executionmethod already passes the cacheddocument:to skip re-parsing (~10ms on large queries). This extends that optimization by also passingvalidate: false, avoiding the validation pipeline cost on every cache-hit request.Safety
yieldpath inbuild_and_validate_queryallow_any_query_for_clientsqueries that don't hit the cache also go through full validationvalidate: falseonly disables static (query-string) validation. Variable validation still runs unconditionally inGraphQL::Query::ValidationPipelinevia@query.variables.errors(which coerces and validates each provided variable against its declared type), so invalid variable values are still reported on every request. There is a regression test covering this.Benchmark
benchmarks/query_registry/skip_validation.rbmeasures the cost ofGraphQL::Query.new(document:) + query.valid?withvalidate: true(prior behavior) vsvalidate: false(new cache-hit behavior). The skip cost is a flat ~13 μs; thevalidate: truecost scales with query complexity:Full results:
benchmarks/query_registry/skip_validation.results.txt.Test plan
elasticgraph-query_registryspecs pass (112 examples, 0 failures) with 100% line and branch coveragequery.validateisfalsefor byte-for-byte cache hits andtruefor the yield path (differing / first-time alternate form)GraphQL::StaticValidation::BaseVisitor.including_rulesis not called for cache-hit registered queries but is called for unregistered (yielded) queries — without thevalidate: falsechange these assertions flipscript/lintcleanscript/type_checkclean