[subquery][test] Preserve COUNT-containing scalar-subquery semantics by Excaliiiibur · Pull Request #1 · Excaliiiibur/gpdb

Excaliiiibur · 2026-04-22T08:37:42Z

Related to #2

Summary

This is not an ORCA issue; it occurs in the fallback PostgreSQL optimizer rewrite path.

Fix incorrect semantics in correlated scalar aggregate subquery pull-up when target expression contains COUNT.

This risk applies not only to manual optimizer=off, but also to other cases where planning falls back from ORCA to PostgreSQL optimizer.

What Changed

In cdbsubselect.c (convert_EXPR_to_join), for COUNT-containing scalar aggregate expressions:

use LEFT JOIN in rewrite
replace pulled-up expression with COALESCE(agg_expr, default_expr)
derive default_expr from empty-input aggregate defaults:
- COUNT(...) -> 0
- non-COUNT aggregates -> NULL (aggregate result type)

This preserves no-match semantics for both COUNT-only and mixed COUNT+other aggregate expressions.

Tests Updated

Added/updated subselect_gp regression cases for:

count(*) + 1
(count(*) + 1)::bigint
case when count(*) > 0 then count(*) + 1 else 1 end
abs(count(*) - 1)
count(*) + coalesce(sum(...), 0)
coalesce(count(*) + sum(...), -1)

Files

src/backend/cdb/cdbsubselect.c
src/test/regress/sql/subselect_gp.sql
src/test/regress/expected/subselect_gp.out
src/test/regress/expected/subselect_gp_optimizer.out

…back rewrite Fix scalar-subquery pull-up in cdbsubselect for target expressions that contain COUNT aggregates. In the fallback PostgreSQL optimizer rewrite path, preserve no-match semantics with LEFT JOIN + COALESCE(agg_expr, default_expr). The default expression is derived from the original aggregate expression tree using empty-input defaults: COUNT -> 0, non-COUNT aggregates -> NULL of aggregate result type. Add regression coverage for COUNT-only and mixed COUNT+other aggregate expressions (including CASE/FuncExpr/COALESCE wrappers), and keep explicit fallback-path validation (non-ORCA-specific).

Skip the trailing typmod conversion when the result is already marked NULL. Add more edge cases to check extension stability.

…rameter. (open-gpdb#379) Avoid fetching one past the end of translate()'s "to" parameter. This is usually harmless, but if you were very unlucky it could provoke a segfault due to the "to" string being right up against the end of memory. Found via valgrind testing (so we might've found it earlier, except that our regression tests lacked any exercise of translate()'s deletion feature). Fix by switching the order of the test-for-end-of-string and advance-pointer steps. While here, compute "to_ptr + tolen" just once. (Smarter compilers might figure that out for themselves, but let's just make sure.) Report and fix by Daniil Anisimov, in bug #17816. Discussion: https://postgr.es/m/17816-70f3d2764e88a108@postgresql.org Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Backported-by: reshke <reshke@double.cloud> Reviewed-by: rkhapov <rkhapov@yandex-team.ru> Reviewed-by: leborchuk <xifos@yandex-team.ru>

The band-aid applied in commit f0bedf3 turns out to still need some work: it made sure we didn't set Np->last_relevant too small (to the left of the decimal point), but it didn't prevent setting it too large (off the end of the partially-converted string). This could result in fetching data beyond the end of the allocated space, which with very bad luck could cause a SIGSEGV, though I don't see any hazard of interesting memory disclosure. Per bug #17839 from Thiago Nunes. The bug's pretty ancient, so back-patch to all supported versions. Discussion: https://postgr.es/m/17839-aada50db24d7b0da@postgresql.org Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Backport-by: reshke

…nsumer (open-gpdb#375) Inspired by greengage 51fe92e: before Expr->DXL translation, walk the physical tree and track which slice each CTE Producer and Consumer lives on. If a Consumer is on a different slice than its Producer and the Producer's distribution is replicated, force a fallback to the Postgres optimizer. The replicated filter is essential: ordinary cross-slice CTE plans (non-replicated Producer with Gather/Redistribute Consumer) are a normal ORCA pattern and must not trigger fallback. 51fe92e doesn't trigger when a CTE over a replicated table is referenced from a scalar subquery, so the query hangs. This commit replaces the single-point check with a whole-tree walker that catches both cases. Tests: bring the original qp_orca_fallback case from 51fe92e and add a scalar-subquery reproducer in shared_scan guarded by statement_timeout.

…pen-gpdb#383) gistBuildCallback tried to fetch the size of an index tuple that might have already been freed by gistProcessEmptyingQueue. While this seems to usually be harmless in production builds, in principle it could result in a SIGSEGV, or more likely a bogus value for indtuplesSize leading to poor page-split decisions later in the build. The memory management here is confusing and could stand to be refactored, but for the moment it seems to be enough to fetch the tuple size sooner. AFAICT the indtuples[Size] totals aren't used in between these places; even if they were, the updated values shouldn't be any worse to use. So just move the incrementing of the totals up. It's not very clear why our valgrind-using buildfarm animals haven't noticed this problem, because the relevant code path does seem to be exercised according to the code coverage report. I think the reason that we didn't fix this bug after the first report is that I'd wanted to try to understand that better. However, now that it's been re-discovered let's just be pragmatic and fix it already. Original report by Alexander Lakhin (bug #16329), later rediscovered by Egor Chindyaskin (bug #17874). Patch by Alexander Lakhin (commentary by Pavel Borisov and me). Back-patch to all supported branches. Discussion: https://postgr.es/m/16329-7a6aa9b6fa1118a1@postgresql.org Discussion: https://postgr.es/m/17874-63ca6c7ce42d2103@postgresql.org Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>

…pdb#385)

) The OTM buckets gpyezzey2/gpyezzey3 had no credentials, so yproxy rejected offload into the tab1/tab2 tablespaces. Add a credential_map with the access keys for all buckets to fix it.

Excaliiiibur mentioned this pull request Apr 22, 2026

[subquery][test] COUNT-containing scalar subquery rewrite in fallback PostgreSQL optimizer path #2

Open

Alena0704 and others added 8 commits April 22, 2026 12:42

Fix crash on NULL fallback with typmod.

d8cc486

Skip the trailing typmod conversion when the result is already marked NULL. Add more edge cases to check extension stability.

Force public schema for gp_interconnect_stats.control install (open-g…

13fdc15

…pdb#385)

Fix CI: add S3 credentials for OTM buckets in yproxy.conf (open-gpdb#386

fd79f4f

) The OTM buckets gpyezzey2/gpyezzey3 had no credentials, so yproxy rejected offload into the tab1/tab2 tablespaces. Add a credential_map with the access keys for all buckets to fix it.

Merge branch 'OPENGPDB_STABLE' into bugfix/count-aggr-first-fallback

407d530

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[subquery][test] Preserve COUNT-containing scalar-subquery semantics#1

[subquery][test] Preserve COUNT-containing scalar-subquery semantics#1
Excaliiiibur wants to merge 9 commits into
OPENGPDB_STABLEfrom
bugfix/count-aggr-first-fallback

Excaliiiibur commented Apr 22, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Excaliiiibur commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What Changed

Tests Updated

Files

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Excaliiiibur commented Apr 22, 2026 •

edited

Loading