Fix filters after outer joins silently turning them into inner joins #2103
Merged
WHERE-after-OUTER JOIN silently turning RIGHT/LEFT/FULL joins into INNER joins
…an OUTER JOIN were being applied in WHERE, defeating the join.
Summary
When a v3 measures query joins to a dim via RIGHT OUTER/FULL OUTER, filters on adjacent left-joined dims land in the outer WHERE, which is evaluated after the outer join. This silently breaks the outer join's preservation semantics: preserved-side rows (the ones the outer join exists to keep) are dropped whenever the filter's column is NULL on them. This PR fixes that and consolidates the related filter-routing logic.
For queries that mix a filtered LEFT JOIN with a RIGHT/FULL OUTER join, the result rows differ from before this PR: preserved-side rows from the outer join now survive when no matching fact row exists, matching the user's intent for RIGHT OUTER (preserve everything from the right). Queries that don't involve RIGHT/FULL joins are unaffected (pure-LEFT-chain filter narrowing semantics remain as they are today).
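A minimal sketch of the failure mode, using Python's built-in sqlite3 module. The tables (orders, clients, regions) and their data are hypothetical and are not taken from this PR. To stay portable across SQLite versions, the preserved dim is written on the left of a LEFT JOIN, which here plays the role of `... RIGHT OUTER JOIN regions`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders  (order_id INTEGER, client_id INTEGER, region_id INTEGER, amount REAL);
    CREATE TABLE clients (client_id INTEGER, client_name TEXT);
    CREATE TABLE regions (region_id INTEGER, region_name TEXT);
    INSERT INTO orders  VALUES (1, 10, 100, 50.0), (2, 11, 100, 20.0);
    INSERT INTO clients VALUES (10, 'acme'), (11, 'globex');
    INSERT INTO regions VALUES (100, 'east'), (200, 'west');  -- 'west' has no orders
""")

# Without a filter, the outer join keeps 'west' with a NULL measure.
unfiltered = conn.execute("""
    SELECT r.region_name, SUM(o.amount)
    FROM regions r
    LEFT JOIN orders  o ON o.region_id = r.region_id
    LEFT JOIN clients c ON c.client_id = o.client_id
    GROUP BY r.region_name
    ORDER BY r.region_name
""").fetchall()

# A filter on the left-joined dim, placed in the outer WHERE, runs after the
# outer join. c.client_name is NULL on the preserved 'west' row, so the row
# is dropped and the join behaves like an inner join.
defeated = conn.execute("""
    SELECT r.region_name, SUM(o.amount)
    FROM regions r
    LEFT JOIN orders  o ON o.region_id = r.region_id
    LEFT JOIN clients c ON c.client_id = o.client_id
    WHERE c.client_name = 'acme'
    GROUP BY r.region_name
    ORDER BY r.region_name
""").fetchall()

print(unfiltered)  # [('east', 70.0), ('west', None)]
print(defeated)    # [('east', 50.0)]
```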
OUTER-JOIN-safe filter routing in CTE bodies
When a right or full outer join is present in the chain, every left/inner-joined dim that has a defeating filter is consolidated into a <parent>_filtered CTE that applies its filters before the outer join reaches it. The outer FROM then reads from the filtered CTE, and the outer join preserves rows correctly.
Absorbed-dim references (t2.client_name) are rewritten to the wrapper alias (t1.client_name) across the projection, GROUP BY, kept-join ONs, and remaining WHERE atoms. The absorber handles both qualification styles DJ uses: _table-set (projection/GROUP BY) and name.namespace-set (filter atoms via _add_table_prefixes_to_filter).
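The routing described above can be sketched by hand as follows. This is an illustration under assumptions, not the SQL that DJ generates: the schema, the data, and the CTE name orders_filtered (following the <parent>_filtered pattern) are hypothetical, and the preserved dim is again written on the left of a LEFT JOIN to stand in for a RIGHT OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders  (order_id INTEGER, client_id INTEGER, region_id INTEGER, amount REAL);
    CREATE TABLE clients (client_id INTEGER, client_name TEXT);
    CREATE TABLE regions (region_id INTEGER, region_name TEXT);
    INSERT INTO orders  VALUES (1, 10, 100, 50.0), (2, 11, 100, 20.0);
    INSERT INTO clients VALUES (10, 'acme'), (11, 'globex');
    INSERT INTO regions VALUES (100, 'east'), (200, 'west');  -- 'west' has no orders
""")

# The filtered dim is absorbed into a CTE together with its parent, so the
# filter is applied before the outer join. References to the absorbed dim
# (client_name) are then read through the wrapper alias t1.
routed = conn.execute("""
    WITH orders_filtered AS (
        SELECT o.region_id, o.amount, c.client_name
        FROM orders o
        LEFT JOIN clients c ON c.client_id = o.client_id
        WHERE c.client_name = 'acme'
    )
    SELECT r.region_name, SUM(t1.amount)
    FROM regions r
    LEFT JOIN orders_filtered t1 ON t1.region_id = r.region_id
    GROUP BY r.region_name
    ORDER BY r.region_name
""").fetchall()

print(routed)  # [('east', 50.0), ('west', None)]
```

The filter still narrows the fact rows to 'acme', but the preserved 'west' row survives with a NULL measure instead of disappearing.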
Add COALESCE for outer join FKs
When the full-skip optimization substitutes a dim's PK with the fact's FK and a sibling dim shares the same FK alignment via its own link, the projection now emits COALESCE(t1.foreign_key, t3.primary_key) AS final_key and the GROUP BY mirrors the COALESCE. This preserves correct values under outer joins that null-fill the fact side.
Redundant outer-WHERE cleanup
The parent-alias atoms that are already pushed into the parent CTE (via the existing pushdown machinery) are no longer duplicated into the outer SELECT's WHERE. They were redundant and, under downstream OUTER JOINs, actively unsafe.
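The COALESCE projection for outer join FKs described earlier can be illustrated with a small sketch. The tables and data are hypothetical, the aliases t1 (fact) and t3 (dim) simply mirror the ones used in this description, and the dim is written on the left of a LEFT JOIN to stand in for an outer join that null-fills the fact side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders  (order_id INTEGER, region_id INTEGER, amount REAL);
    CREATE TABLE regions (region_id INTEGER, region_name TEXT);
    INSERT INTO orders  VALUES (1, 100, 50.0), (2, 100, 20.0);
    INSERT INTO regions VALUES (100, 'east'), (200, 'west');  -- 'west' has no orders
""")

# Projecting only the fact's FK loses the key on null-filled rows:
# region 200 shows up with a NULL key.
fk_only = conn.execute("""
    SELECT t1.region_id AS final_key, SUM(t1.amount)
    FROM regions t3
    LEFT JOIN orders t1 ON t1.region_id = t3.region_id
    GROUP BY t1.region_id
    ORDER BY final_key
""").fetchall()

# COALESCE falls back to the dim's PK, and the GROUP BY mirrors the
# same expression so grouping and projection stay aligned.
coalesced = conn.execute("""
    SELECT COALESCE(t1.region_id, t3.region_id) AS final_key, SUM(t1.amount)
    FROM regions t3
    LEFT JOIN orders t1 ON t1.region_id = t3.region_id
    GROUP BY COALESCE(t1.region_id, t3.region_id)
    ORDER BY final_key
""").fetchall()

print(fk_only)    # [(None, None), (100, 70.0)]
print(coalesced)  # [(100, 70.0), (200, None)]
```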
Test Plan
make check passes
make test shows 100% unit test coverage
Deployment Plan