Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs
In [1]/messages/by-id/18904-c5fea7892f4d26ed@postgresql.org there was a report that set operations didn't correctly detect
when inputs were provably empty sets. While this is not the bug that
the report claimed it to be, as it's just a missing optimisation, I
did decide to look at it to check if there was much performance to
gain from doing this.
The short of it is, Yes, there are cases when this can help query
performance. Primarily, this seems to come from when the code detects
that an EXCEPT ALL has an empty right-hand input. In this case, we can
scan the left-hand input and forego the SetOp node completely.
With EXCEPT (without ALL), deduplication is still required, however
that can be done with an Aggregate node on the left input rather than
using the slightly less efficient SetOp node.
If I create two tables with a single int column containing 1 million
rows each, ANALYZE them and run some queries with and without the
patch, I see:
(work_mem=256MB, pgbench -M simple -T 10)
master:
EXCEPT ALL left dummy : tps = 8466.587802
EXCEPT ALL right dummy : tps = 3.160083
EXCEPT left dummy : tps = 8433.607519
EXCEPT right dummy : tps = 3.178104
INTERSECT (all types) : tps = 8392.695606
UNION left dummy : tps = 3.406355
patched:
EXCEPT ALL left dummy : tps = 8973.958896 (+5.99%)
EXCEPT ALL right dummy : tps = 53.583312 (+1595.63%)
EXCEPT left dummy : tps = 8736.716176 (+3.59%)
EXCEPT right dummy : tps = 3.385520 (+6.53%)
INTERSECT (all types) : tps = 8759.123942 (+4.37%)
UNION left dummy : tps = 3.590264 (+5.40%)
You can see EXCEPT ALL with the empty right-hand input became ~15x
faster, and all the others became ~5% faster.
There are some additional benefits aside from the performance as it's
possible to provide better row estimates in certain cases. For
example, if a UNION query removes all apart from 1 input, we can do
estimate_num_groups() on that input. Otherwise, we're left to the
assumption that all rows are unique, which certainly could cause some
trouble later in planning for queries consuming the results of set
operations in subqueries. EXCEPT with an empty right-hand input also
benefits from improved row estimates for the same reason.
For me, this seems worth doing. Set operations have been drawn out of
the dark ages with the last few releases, and I feel this makes them
more aligned to the set of optimisations we've come to expect in other
parts of the planner.
I'm happy to hear other opinions.
Patch attached.
David
Attachments:
v1-0001-Teach-planner-to-short-circuit-plans-with-empty-s.patchapplication/octet-stream; name=v1-0001-Teach-planner-to-short-circuit-plans-with-empty-s.patchDownload+360-10
David Rowley <dgrowleyml@gmail.com> writes:
In [1] there was a report that set operations didn't correctly detect
when inputs were provably empty sets. While this is not the bug that
the report claimed it to be, as it's just a missing optimisation, I
did decide to look at it to check if there was much performance to
gain from doing this.
I'm kind of resistant to the amount of code this patch adds in
comparison to the likely benefit. Sure, a badly written query can
profit, but is it worth debugging and maintaining a couple hundred
lines of code for that?
The first few hunks of changes seem fine by this light, but I think
you're expending too much effort on the EXCEPT-with-dummy-inputs
cases. I'm wondering if it could be shortened a great deal by
handling left-input-dummy and EXCEPT-ALL-with-right-input-dummy
but leaving the EXCEPT-with-right-input-dummy case unimproved.
regards, tom lane
On Thu, 2 Oct 2025 at 16:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm wondering if it could be shortened a great deal by
handling left-input-dummy and EXCEPT-ALL-with-right-input-dummy
but leaving the EXCEPT-with-right-input-dummy case unimproved.
Good idea. Less code and still get to keep the one that did well in
the benchmark. See attached.
I ended up splitting the patch in two. 0001 for UNION only, then 0002
for the INTERSECT and EXCEPT.
David
Attachments:
v2-0001-Teach-UNION-planner-to-remove-dummy-inputs.patchapplication/octet-stream; name=v2-0001-Teach-UNION-planner-to-remove-dummy-inputs.patchDownload+120-10
v2-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchapplication/octet-stream; name=v2-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchDownload+147-1
Hi,
David Rowley <dgrowleyml@gmail.com>于2025年10月2日 周四20:09写道:
On Thu, 2 Oct 2025 at 16:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm wondering if it could be shortened a great deal by
handling left-input-dummy and EXCEPT-ALL-with-right-input-dummy
but leaving the EXCEPT-with-right-input-dummy case unimproved.Good idea. Less code and still get to keep the one that did well in
the benchmark. See attached.I ended up splitting the patch in two. 0001 for UNION only, then 0002
for the INTERSECT and EXCEPT.David
It seems that the optimization for `UNION ALL` is already implemented in
the patch: it removes empty sub-paths and preserves the remaining ones.
Should we add a test case to formally validate this behavior like Union
cases?
David Rowley <dgrowleyml@gmail.com> writes:
Good idea. Less code and still get to keep the one that did well in
the benchmark. See attached.
0001's change in is_dummy_rel() seems like a complete hack, especially
since you didn't bother to change the adjacent comment. Why is that
necessary?
v2 LGTM otherwise.
regards, tom lane
On Fri, 3 Oct 2025 at 04:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
0001's change in is_dummy_rel() seems like a complete hack, especially
since you didn't bother to change the adjacent comment. Why is that
necessary?
build_setop_child_paths() wraps the child inputs in SubqueryScanPaths,
so we need to see through those.
An alternative way would be to propagate those during build_setop_child_paths()
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -523,6 +523,13 @@ build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
bool is_sorted;
int presorted_keys;
+ /* If the input rel is dummy, propagate that to this query level */
+ if (is_dummy_rel(final_rel))
+ {
+ mark_dummy_rel(rel);
+ continue;
+ }
+
As attached.
v2 LGTM otherwise.
Thanks
David
Attachments:
v3-0001-Teach-UNION-planner-to-remove-dummy-inputs.patchapplication/octet-stream; name=v3-0001-Teach-UNION-planner-to-remove-dummy-inputs.patchDownload+125-10
v3-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchapplication/octet-stream; name=v3-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchDownload+147-1
David Rowley <dgrowleyml@gmail.com> writes:
On Fri, 3 Oct 2025 at 04:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
0001's change in is_dummy_rel() seems like a complete hack, especially
since you didn't bother to change the adjacent comment. Why is that
necessary?
build_setop_child_paths() wraps the child inputs in SubqueryScanPaths,
so we need to see through those.
Ah.
An alternative way would be to propagate those during build_setop_child_paths()
That answer works for me. I was expecting you to just document the
need for the extra check in is_dummy_rel ;-) ... but this way is
perhaps better.
regards, tom lane
On Fri, 3 Oct 2025 at 02:55, Mingli Zhang <zmlpostgres@gmail.com> wrote:
It seems that the optimization for `UNION ALL` is already implemented in the patch: it removes empty sub-paths and preserves the remaining ones.
Should we add a test case to formally validate this behavior like Union cases?
If I were to do that, I'd have to come up with something that's
flatten_simple_union_all() proof. Maybe something like varying types
in the targetlist. I think it's probably not really worthwhile since
it's not testing any new code that is not already being tested by the
tests that I did add.
David
On Fri, 3 Oct 2025 at 10:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
An alternative way would be to propagate those during build_setop_child_paths()
That answer works for me. I was expecting you to just document the
need for the extra check in is_dummy_rel ;-) ... but this way is
perhaps better.
So, I pushed the UNION portion earlier, but on hacking more on the
EXCEPT/INTERSECT patch, I noticed that I don't have the target lists
correct when marking the top-level set op as dummy. I had thought it
was ok to use the target list of the first child. I did that to make
EXPLAIN VERBOSE work as it chokes on the varno==0 top-level setop
targetlist as generated by generate_append_tlist(). However, using the
target list of the first child isn't right as createplan will choke on
not finding PathKeys to sort.
It looks like the normal UNION case side steps this issue for T_SetOp
by applying set_dummy_tlist_references() in setrefs.c. That doesn't
happen for Result since we may have something to evaluate there.
I'm just considering the best fix and can think of two options:
1) Move away from using varno==0 in generate_append_tlist(). Use varno==1, or;
2) Add handling in setrefs.c for T_Result to adjust varno==0 Vars to
use varno==1 vars.
The attached v4-0001 does #2, but wondering if #1 should be explored first.
David
Attachments:
v4-0001-Fix-incorrect-use-of-targetlist-in-dummy-UNIONs.patchapplication/octet-stream; name=v4-0001-Fix-incorrect-use-of-targetlist-in-dummy-UNIONs.patchDownload+33-13
v4-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchapplication/octet-stream; name=v4-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchDownload+185-1
David Rowley <dgrowleyml@gmail.com> writes:
I'm just considering the best fix and can think of two options:
1) Move away from using varno==0 in generate_append_tlist(). Use varno==1, or;
2) Add handling in setrefs.c for T_Result to adjust varno==0 Vars to
use varno==1 vars.
The attached v4-0001 does #2, but wondering if #1 should be explored first.
I don't recall the details ATM, but if you poke around you will find
multiple comments complaining about how that varno-zero convention is
problematic or requires code to do something unusual. So I'd be in
favor of trying to get rid of it, but I'm not entirely sure what to do
instead, and the ramifications might be wider than you realize.
In particular it's not clear to me why varno==1 is better? As best
I can recall without diving into code, the fundamental mismatch is
that varno zero doesn't correspond to any RTE. It would be better
if the Vars matched the subquery RTEs that are at the base of the
set-operation nest, so that there were a useful referent as to where
a Var came from. Arbitrarily setting varno=1 sounds like the worst
case: we could neither identify a Var with a source subquery
accurately, nor realize that its varno is phony.
regards, tom lane
On Sun, 5 Oct 2025 at 04:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:
In particular it's not clear to me why varno==1 is better? As best
I can recall without diving into code, the fundamental mismatch is
that varno zero doesn't correspond to any RTE. It would be better
if the Vars matched the subquery RTEs that are at the base of the
set-operation nest, so that there were a useful referent as to where
a Var came from. Arbitrarily setting varno=1 sounds like the worst
case: we could neither identify a Var with a source subquery
accurately, nor realize that its varno is phony.
I've not tried it yet, but the idea with varno==1 is that is that it
does index the first UNION child's RTE. For the case at hand, all the
children are dummies. I thought doing this was ok because Appends and
MergeAppends show the target entries for the first child, and cases
like the following do show Vars from RTEs that don't get scanned in
the plan:
# explain verbose select * from t where 1=2 order by 1;
Sort (cost=0.01..0.02 rows=0 width=0)
Output: a, b
Sort Key: t.a
-> Result (cost=0.00..0.00 rows=0 width=0)
Output: a, b
Replaces: Scan on t
One-Time Filter: false
Another alternative that I'm thinking about which might be better is
to double-down on the varno==0 and invent a special varno and define
SETOP_VAR. I'd feel better about doing that as I didn't feel good
about coding the magic number for the if(var->varno == 0) check in
set_plan_refs() to make the T_Result work in EXPLAIN VERBOSE.
David
On Sun, 5 Oct 2025 at 13:56, David Rowley <dgrowleyml@gmail.com> wrote:
Another alternative that I'm thinking about which might be better is
to double-down on the varno==0 and invent a special varno and define
SETOP_VAR. I'd feel better about doing that as I didn't feel good
about coding the magic number for the if(var->varno == 0) check in
set_plan_refs() to make the T_Result work in EXPLAIN VERBOSE.
I decided not to do it that way and instead just added code to create
a varno==1 Var in setrefs.c. This basically amounts to following on
with the varno==0 hack used in prepunion.c.
The reason I didn't go down the route of SETOP_VAR was that it's still
a hack, it's just making it look a bit more official. I suppose the
correct way to fix all this and get rid of the varno==0 stuff forever
is to have a proper top-level RTE for the top-level set operation and
make it so each child is an OTHER_MEMBER rel at that query level. It
felt like going a bit too far to do something like that to fix this
bug, so I didn't explore that further.
David
David Rowley <dgrowleyml@gmail.com> writes:
The reason I didn't go down the route of SETOP_VAR was that it's still
a hack, it's just making it look a bit more official. I suppose the
correct way to fix all this and get rid of the varno==0 stuff forever
is to have a proper top-level RTE for the top-level set operation and
make it so each child is an OTHER_MEMBER rel at that query level. It
felt like going a bit too far to do something like that to fix this
bug, so I didn't explore that further.
Yeah, I think "more RTEs" is the ultimate solution here, but it's
never risen to the top of my to-do list either. I was kind of
thinking about an RTE per set-op child, though. Not sure if one
for the top-level op, or one for an intermediate op, would help;
though it's certainly possible it would.
regards, tom lane
Hello David,
04.10.2025 06:55, David Rowley wrote:
On Fri, 3 Oct 2025 at 10:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
An alternative way would be to propagate those during build_setop_child_paths()
That answer works for me. I was expecting you to just document the
need for the extra check in is_dummy_rel ;-) ... but this way is
perhaps better.So, I pushed the UNION portion earlier, but on hacking more on the
EXCEPT/INTERSECT patch, I noticed that I don't have the target lists
correct when marking the top-level set op as dummy. ...
Please look at a new anomaly, introduced with 03d40e4b5:
CREATE TABLE t(i integer);
CREATE TABLE pt(i integer) PARTITION BY LIST(i);
SET enable_seqscan = 'off';
SELECT * FROM t UNION SELECT * FROM t
UNION ALL
SELECT * FROM pt;
produces:
ERROR: XX000: unrecognized node type: 0
LOCATION: create_plan_recurse, createplan.c:538
Best regards.
Alexander
On Tue, Nov 4, 2025 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
Please look at a new anomaly, introduced with 03d40e4b5:
CREATE TABLE t(i integer);
CREATE TABLE pt(i integer) PARTITION BY LIST(i);SET enable_seqscan = 'off';
SELECT * FROM t UNION SELECT * FROM t
UNION ALL
SELECT * FROM pt;
produces:
ERROR: XX000: unrecognized node type: 0
LOCATION: create_plan_recurse, createplan.c:538
I looked into this. The child relation with relid 3 (the scan on the
partitioned table) is a dummy, so it is skipped in
generate_union_paths(). As a result, the final setop relation ends up
the same as the child relation with relid set to (1, 2). Then,
generate_union_paths() creates an Append path using this relation's
cheapest path as its subpath. Somehow, add_path() determines that
this new Append path dominates the original cheapest path, causing the
original cheapest path to be freed. This leads to the final Append
path referencing a subpath that has already been freed.
- Richard
On Tue, 4 Nov 2025 at 21:00, Alexander Lakhin <exclusion@gmail.com> wrote:
Please look at a new anomaly, introduced with 03d40e4b5:
CREATE TABLE t(i integer);
CREATE TABLE pt(i integer) PARTITION BY LIST(i);SET enable_seqscan = 'off';
SELECT * FROM t UNION SELECT * FROM t
UNION ALL
SELECT * FROM pt;
produces:
ERROR: XX000: unrecognized node type: 0
Thanks for the report.
This seems to be due to the fact that the same UPPERREL_SETOP
RelOptInfo is being used for the UNION and UNION ALL. When doing the
add_path() for the UNION ALL at prepunion.c:1030, the sole child path
of the Append being added is an AggPath for the previous UNION
operation. When we add_path() the AppendPath, the previous AggPath is
already in the pathlist of the result_rel. add_path() ends up freeing
the old Path due to the new AppendPath being better and the end result
is the Append's subpath is free'd.
The reason we end up with the same result_rel is that we're not
passing all the relids in fetch_upper_rel(root, UPPERREL_SETOP,
relids) due to having removed dummy rels. I guess the fix might be
something like record the relids even when skipping dummy relations.
I'll go and explore that as an option.
David
On Tue, 4 Nov 2025 at 22:54, David Rowley <dgrowleyml@gmail.com> wrote:
The reason we end up with the same result_rel is that we're not
passing all the relids in fetch_upper_rel(root, UPPERREL_SETOP,
relids) due to having removed dummy rels. I guess the fix might be
something like record the relids even when skipping dummy relations.
I'll go and explore that as an option.
This seems to fix it. I'll study it more in the morning (it's late in
my time zone).
David
Attachments:
minimal_fix.patchapplication/octet-stream; name=minimal_fix.patchDownload+6-2
On Tue, Nov 4, 2025 at 6:19 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Tue, Nov 4, 2025 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
Please look at a new anomaly, introduced with 03d40e4b5:
CREATE TABLE t(i integer);
CREATE TABLE pt(i integer) PARTITION BY LIST(i);SET enable_seqscan = 'off';
SELECT * FROM t UNION SELECT * FROM t
UNION ALL
SELECT * FROM pt;
produces:
ERROR: XX000: unrecognized node type: 0
LOCATION: create_plan_recurse, createplan.c:538I looked into this. The child relation with relid 3 (the scan on the
partitioned table) is a dummy, so it is skipped in
generate_union_paths(). As a result, the final setop relation ends up
the same as the child relation with relid set to (1, 2). Then,
generate_union_paths() creates an Append path using this relation's
cheapest path as its subpath. Somehow, add_path() determines that
this new Append path dominates the original cheapest path, causing the
original cheapest path to be freed. This leads to the final Append
path referencing a subpath that has already been freed.
I think a quick fix is to always include the involved child relids
when building the relid set for the setop relation, even if the child
is dummy. A dummy child does not yield any rows, but theoretically it
is still a relation that we should account for.
- Richard
Attachments:
v1-0001-Fix-generate_union_paths.patchapplication/octet-stream; name=v1-0001-Fix-generate_union_paths.patchDownload+9-5
On Tue, 4 Nov 2025 at 23:00, David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 4 Nov 2025 at 22:54, David Rowley <dgrowleyml@gmail.com> wrote:
The reason we end up with the same result_rel is that we're not
passing all the relids in fetch_upper_rel(root, UPPERREL_SETOP,
relids) due to having removed dummy rels. I guess the fix might be
something like record the relids even when skipping dummy relations.
I'll go and explore that as an option.This seems to fix it. I'll study it more in the morning (it's late in
my time zone).
I went over this again this morning. I considered adding the following test:
-- Try a more complex case with multiple chained UNIONs
EXPLAIN (COSTS OFF)
SELECT two FROM tenk1
UNION
SELECT four FROM tenk1
UNION ALL
SELECT ten FROM tenk1 WHERE 1=2;
but it seems that enable_seqscan does also need to be disabled as
otherwise add_path() just finds the new and old path to cost the same
and rejects the new path. With enable_seqscan = off,
compare_path_costs_fuzzily() will find the old path to have
disabled_node = 2 and the new one to have disabled_nodes = 0, so
accepts the new and pfree's the old.
I finally decided that it was a bit too obscure a scenario to test to
verify that the same silly mistake didn't reappear.
Thanks again for the report and the simple recreation steps.
David
Hello David!
05.11.2025 00:55, David Rowley wrote:
I finally decided that it was a bit too obscure a scenario to test to
verify that the same silly mistake didn't reappear.Thanks again for the report and the simple recreation steps.
Thank you for the fix!
But while playing around, I've just discovered another interesting error:
SET enable_seqscan = 'off';
SELECT * FROM t EXCEPT SELECT * FROM t
UNION SELECT * FROM pt;
leads to:
ERROR: XX000: no relation entry for relid 0
LOCATION: find_base_rel, relnode.c:541
Best regards,
Alexander