Subquery (SubPlan) pushdown #289

Open: JoshDreamland wants to merge 2 commits into main from subquery-pushdown-v3-rebased

@JoshDreamland

Contributor

Folds correlated/uncorrelated scalar (EXPR), EXISTS, and ANY/IN subqueries, plus NOT EXISTS/NOT IN (ANTI joins), into the remote SQL. Rebased onto current main; supersedes the retired subquery-pushdown-v3 with all review feedback folded in. Draft pending one more pass.

@JoshDreamland force-pushed the subquery-pushdown-v3-rebased branch from 0f1fab8 to 8c506c3 on July 1, 2026 17:10

Deparse the correlated and uncorrelated SubPlans that pull_up_sublinks
cannot flatten into joins, so they execute on ClickHouse instead of
being evaluated locally by Postgres. Covers correlated scalar subqueries
(TPC-H Q2/Q17), uncorrelated scalar subqueries in WHERE/HAVING
(Q11/Q22), IN with GROUP BY/HAVING (Q18), and NOT IN (Q16, deparsed as
LEFT ANTI JOIN).

Gate the pushdown on the ClickHouse server version: only 25.8+ supports
these correlated shapes; older analyzers error on the SQL, so below 25.8
the SubPlan runs locally. Verified the emitted SQL against ClickHouse
25.8, 26.3, and 26.5.

Tests pin the deparsed Remote SQL via EXPLAIN. The correlated executions
are EXPLAIN-only, since they run only on 25.8+ and would otherwise need
version-split expected output.
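
The version gate described above could look roughly like the sketch below. The function names `parse_ch_version` and `subplan_pushdown_supported` are hypothetical and not taken from this PR. The sketch assumes the server reports its version as a dotted string such as `25.8.1.1`, and it treats an unparseable version as unsupported so that the SubPlan stays local.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: extract the leading "major.minor" pair from a
 * dotted version string. Returns false when the string is NULL or does
 * not start with two dot-separated integers.
 */
static bool
parse_ch_version(const char *s, int *major, int *minor)
{
	return s != NULL && sscanf(s, "%d.%d", major, minor) == 2;
}

/*
 * Hypothetical gate: allow SubPlan pushdown only on ClickHouse 25.8 or
 * later. An unknown version is treated as "too old", so the SubPlan is
 * evaluated locally instead of sending SQL the server may reject.
 */
static bool
subplan_pushdown_supported(const char *server_version)
{
	int			major;
	int			minor;

	if (!parse_ch_version(server_version, &major, &minor))
		return false;

	return major > 25 || (major == 25 && minor >= 8);
}
```

Failing closed on an unknown version is a design assumption here: it trades a missed pushdown for never emitting correlated SQL that an older analyzer would reject.
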

@JoshDreamland force-pushed the subquery-pushdown-v3-rebased branch from 8c506c3 to d9c6d11 on July 1, 2026 17:57
@JoshDreamland requested review from serprex and theory and removed request for theory on July 1, 2026 18:04
@JoshDreamland marked this pull request as ready for review on July 1, 2026 18:05
@serprex requested a review from theory on July 1, 2026 18:45

Use an asterisk to indicate full pushdown that still requires multiple
foreign scans. Outcome: Q11 is now recorded as pushed down, and Q2, Q17,
and Q22 newly push down.

@theory left a comment

Collaborator

Hell yeah, this is looking great. I have a few nitpicks, nothing serious. Thanks for working on this!

jobs:
  build:
-    env: { pg: 19 }
+    env: { pg: 19, CH_RELEASE: "${{ matrix.ch }}" }

Collaborator

What's the reason for this change?

Comment thread src/deparse.c
/*
* SubPlan-interior aliases. plan_id is unique across PlannedStmt.subplans,
* so q{plan_id}_{varno} can never collide with the outer query's r{N}/s{N}
* aliases nor with a sibling SubPlan's tables. See deparseSubPlanQuery.

Collaborator

Suggested change
- * aliases nor with a sibling SubPlan's tables. See deparseSubPlanQuery.
+ * aliases, nor with a sibling SubPlan's tables. See deparseSubPlanQuery.
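
The alias scheme in the code comment above, `q{plan_id}_{varno}`, can be illustrated with a small sketch. The helper name `subplan_alias` and its signature are hypothetical and not taken from the PR; only the `q{plan_id}_{varno}` pattern comes from the source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a SubPlan-interior table alias of the form
 * q{plan_id}_{varno} into buf. Returns true when the whole alias fit,
 * false when it was truncated or formatting failed.
 */
static bool
subplan_alias(char *buf, size_t len, int plan_id, int varno)
{
	int			n = snprintf(buf, len, "q%d_%d", plan_id, varno);

	return n >= 0 && (size_t) n < len;
}
```

With `plan_id = 1` and `varno = 1` this yields `q1_1`, the alias visible in the Remote SQL quoted later in this conversation. Because the prefix `q` differs from the outer query's `r{N}`/`s{N}` prefixes and `plan_id` is unique per SubPlan, two distinct SubPlans cannot produce the same alias.
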

Comment thread src/deparse.c
if (!IS_UPPER_REL(glob_cxt->foreignrel)) {
/*
* Not safe to pushdown when not in grouping context. Inside a
* SubPlan's Query a bare aggregate is legal (it is the

Collaborator

Suggested change
- * SubPlan's Query a bare aggregate is legal (it is the
+ * SubPlan's Query a bare aggregate is legal (it's the

Comment thread src/deparse.c
 * Unsupported (and why):
 *   ALL_SUBLINK         ClickHouse lacks a direct ALL; NOT IN arrives as
 *                       NOT(ANY) and is handled by the BoolExpr case.
 *   ROWCOMPARE_SUBLINK  multi-column compares not deparsed.

Collaborator

Could they be in the future? Add "currently" if so.

Comment thread src/deparse.c
Comment on lines +955 to +956
 *   ALL_SUBLINK         ClickHouse lacks a direct ALL; NOT IN arrives as
 *                       NOT(ANY) and is handled by the BoolExpr case.

Collaborator

So can we not handle it as NOT IN(...)?

Comment thread src/deparse.c
/*
* Emit a pushed-down SubPlan subquery's FROM list (sans the FROM keyword).
*
* This looks like deparseFromExpr's job but cannot use it: there is no planned

Collaborator

Suggested change
- * This looks like deparseFromExpr's job but cannot use it: there is no planned
+ * This looks like deparseFromExpr but we cannot use it: there is no planned

Comment on lines +132 to +135
Foreign Scan
Output: i.item_id, s.amount
Relations: (items i) INNER JOIN (sales s)
Remote SQL: SELECT r1.item_id, r2.amount FROM subplan_test.items r1 ALL INNER JOIN subplan_test.sales r2 ON (((r1.item_id = r2.item_id))) WHERE (((SELECT max(q1_1.amount) FROM subplan_test.sales q1_1 WHERE ((q1_1.item_id = (r1.item_id)))) = r2.amount)) ORDER BY r1.item_id ASC NULLS LAST

Collaborator

We should also execute the queries and verify the results are correct. Same for #3 above.

(2 rows)

-- ============================================================
-- 6. Scalar subquery in HAVING (TPC-H Q11 shape)

Collaborator

Dupe: Q11 above.

Comment on lines +196 to +198
-- Query 2. EXPLAIN-only: this correlated subquery executes only on ClickHouse
-- 25.8+, so we pin the deparsed plan (version-independent) rather than the
-- execution (which is not).

Collaborator

Can we restore execution? Maybe check the version and don't run it on < 25.8.

Comment thread Makefile
Comment on lines +74 to +79
# The subquery-pushdown tests exercise SubPlan pushdown, which is gated on
# ClickHouse 25.8+ (older analyzers error on the correlated SQL). When CH_RELEASE
# names an older server, drop those tests and emit a GitHub Actions warning
# rather than carry a second set of "not pushed down" expected files. CH_RELEASE
# is unset for local runs and the Postgres matrix (which use the latest CH), so
# they keep the tests.

Collaborator

This confuses me. CH_RELEASE is currently set in the GitHub workflow but nowhere else. Someone downloading it and running the tests won't have it set. I don't have it set! The Makefile has not and should not have any idea of the ClickHouse version IMO. Nor should it have any awareness of GitHub Actions.
