Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

Started by David Rowley3 months ago23 messages
#1David Rowley
dgrowleyml@gmail.com
1 attachment(s)

In [1]/messages/by-id/18904-c5fea7892f4d26ed@postgresql.org there was a report that set operations didn't correctly detect
when inputs were provably empty sets. While this is not the bug that
the report claimed it to be, as it's just a missing optimisation, I
did decide to look at it to check if there was much performance to
gain from doing this.

The short of it is, Yes, there are cases when this can help query
performance. Primarily, this seems to come from when the code detects
that an EXCEPT ALL has an empty right-hand input. In this case, we can
scan the left-hand input and forego the SetOp node completely.

With EXCEPT (without ALL), deduplication is still required, however
that can be done with an Aggregate node on the left input rather than
using the slightly less efficient SetOp node.

If I create two tables with a single int column containing 1 million
rows each, ANALYZE them and run some queries with and without the
patch, I see:

(work_mem=256MB, pgbench -M simple -T 10)

master:
EXCEPT ALL left dummy : tps = 8466.587802
EXCEPT ALL right dummy : tps = 3.160083
EXCEPT left dummy : tps = 8433.607519
EXCEPT right dummy : tps = 3.178104
INTERSECT (all types) : tps = 8392.695606
UNION left dummy : tps = 3.406355

patched:
EXCEPT ALL left dummy : tps = 8973.958896 (+5.99%)
EXCEPT ALL right dummy : tps = 53.583312 (+1595.63%)
EXCEPT left dummy : tps = 8736.716176 (+3.59%)
EXCEPT right dummy : tps = 3.385520 (+6.53%)
INTERSECT (all types) : tps = 8759.123942 (+4.37%)
UNION left dummy : tps = 3.590264 (+5.40%)

You can see EXCEPT ALL with the empty right-hand input became ~15x
faster, and all the others became ~5% faster.

There are some additional benefits aside from the performance as it's
possible to provide better row estimates in certain cases. For
example, if a UNION query removes all apart from 1 input, we can do
estimate_num_groups() on that input. Otherwise, we're left to the
assumption that all rows are unique, which certainly could cause some
trouble later in planning for queries consuming the results of set
operations in subqueries. EXCEPT with an empty right-hand input also
benefits from improved row estimates for the same reason.

For me, this seems worth doing. Set operations have been drawn out of
the dark ages with the last few releases, and I feel this makes them
more aligned to the set of optimisations we've come to expect in other
parts of the planner.

I'm happy to hear other opinions.

Patch attached.

David

[1]: /messages/by-id/18904-c5fea7892f4d26ed@postgresql.org

Attachments:

v1-0001-Teach-planner-to-short-circuit-plans-with-empty-s.patchapplication/octet-stream; name=v1-0001-Teach-planner-to-short-circuit-plans-with-empty-s.patchDownload
From bf45f8b4eafe776e5eb9b7806082b548cba2a9ca Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Wed, 30 Apr 2025 00:03:07 +1200
Subject: [PATCH v1] Teach planner to short-circuit plans with empty setop
 inputs

This adjusts UNION / INTERSECT / EXCEPT planning so that the planner
produces more optimal plans when one or more of the set operator's
subqueries has been proven to be empty (a dummy rel).

For UNION, if any of the inputs are empty, then that input can be removed
from the Append / MergeAppend.  Previously, a const-false "Result" node
would appear to represent this.  Removing empty inputs has a few extra
benefits when only 1 union child remains as it means the Append or
MergeAppend can be removed in setrefs.c and also we can provide better
n_distinct estimates by looking at the remaining input's statistics.

For INTERSECT, if either input is empty, the entire result is empty.
This is true regardless of INTERSECT or INTERSECT ALL.

For EXCEPT, if the left input is empty, then there's certainly no
result rows and there's no point in scanning the right input as there
are no rows to match those ones up to on the left input.  When EXCEPT's
right input is empty, we can simply return all rows on the left input
when it's EXCEPT ALL, otherwise, for EXCEPT without ALL, we must
de-duplicate the left input rows.  Row estimates for the latter should
also be better with this change as we can obtain n_distinct estimates
from the left input directly.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/18904-c5fea7892f4d26ed@postgresql.org
---
 src/backend/optimizer/path/joinrels.c  |   2 +
 src/backend/optimizer/prep/prepunion.c | 188 +++++++++++++++++++++++--
 src/test/regress/expected/union.out    | 120 ++++++++++++++++
 src/test/regress/sql/union.sql         |  59 ++++++++
 4 files changed, 360 insertions(+), 9 deletions(-)

diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 535248aa525..43f4051bc6e 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1298,6 +1298,8 @@ is_dummy_rel(RelOptInfo *rel)
 			path = ((ProjectionPath *) path)->subpath;
 		else if (IsA(path, ProjectSetPath))
 			path = ((ProjectSetPath *) path)->subpath;
+		else if (IsA(path, SubqueryScanPath))
+			path = ((SubqueryScanPath *) path)->subpath;
 		else
 			break;
 	}
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 6bd0f4a5dc3..dc2bcf8efd0 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -763,6 +763,10 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		RelOptInfo *rel = lfirst(lc);
 		Path	   *ordered_path;
 
+		/* Skip any UNION children that are proven not to yield any rows */
+		if (is_dummy_rel(rel))
+			continue;
+
 		cheapest_pathlist = lappend(cheapest_pathlist,
 									rel->cheapest_total_path);
 
@@ -812,6 +816,15 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	result_rel->consider_parallel = consider_parallel;
 	result_rel->consider_startup = (root->tuple_fraction > 0);
 
+	/* If all UNION children were dummy rels, make the resulting rel dummy */
+	if (cheapest_pathlist == NIL)
+	{
+		result_rel->reltarget = create_pathtarget(root, list_nth(tlist_list, 0));
+		mark_dummy_rel(result_rel);
+
+		return result_rel;
+	}
+
 	/*
 	 * Append the child results together using the cheapest paths from each
 	 * union child.
@@ -876,15 +889,33 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		bool		can_sort = grouping_is_sortable(groupList);
 		bool		can_hash = grouping_is_hashable(groupList);
 
-		/*
-		 * XXX for the moment, take the number of distinct groups as equal to
-		 * the total input size, i.e., the worst case.  This is too
-		 * conservative, but it's not clear how to get a decent estimate of
-		 * the true size.  One should note as well the propensity of novices
-		 * to write UNION rather than UNION ALL even when they don't expect
-		 * any duplicates...
-		 */
-		dNumGroups = apath->rows;
+		if (list_length(cheapest_pathlist) == 1)
+		{
+			Path	   *path = linitial(cheapest_pathlist);
+
+			/*
+			 * In the case where only one union child remains due to the
+			 * detection of one or more dummy union children, we'll obtain an
+			 * estimate on the surviving child directly.
+			 */
+			dNumGroups = estimate_num_groups(root,
+											 path->pathtarget->exprs,
+											 path->rows,
+											 NULL,
+											 NULL);
+		}
+		else
+		{
+			/*
+			 * Otherwise, for the moment, take the number of distinct groups
+			 * as equal to the total input size, i.e., the worst case.  This
+			 * is too conservative, but it's not clear how to get a decent
+			 * estimate of the true size.  One should note as well the
+			 * propensity of novices to write UNION rather than UNION ALL even
+			 * when they don't expect any duplicates...
+			 */
+			dNumGroups = apath->rows;
+		}
 
 		if (can_hash)
 		{
@@ -1148,6 +1179,145 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 		result_rel->reltarget = create_setop_pathtarget(root, tlist,
 														list_make2(lpath, rpath));
 
+	/* Check for provably empty setop inputs and add short-circuit paths. */
+	if (op->op == SETOP_EXCEPT)
+	{
+		/*
+		 * For EXCEPTs, if the left side is dummy then there's no need to
+		 * inspect the right-hand side as scanning the right to find tuples to
+		 * remove won't make the left-hand input any more empty.
+		 */
+		if (is_dummy_rel(lrel))
+		{
+			result_rel->reltarget = create_pathtarget(root, lpath_tlist);
+			mark_dummy_rel(result_rel);
+
+			return result_rel;
+		}
+
+		/*
+		 * For EXCEPT, things are easy for the EXCEPT ALL case -- we can
+		 * simply scan the left-hand input.  For EXCEPT without ALL, we must
+		 * perform the work to deduplicate the results.
+		 */
+		if (is_dummy_rel(rrel))
+		{
+			if (op->all)
+			{
+				/*
+				 * EXCEPT ALL: If the right-hand input is dummy then we can
+				 * simply scan the left-hand input without further processing.
+				 */
+				add_path(result_rel, lpath);
+
+				return result_rel;
+			}
+			else
+			{
+				dNumGroups = estimate_num_groups(root,
+												 get_tlist_exprs(lpath_tlist, false),
+												 lrel->rows,
+												 NULL, NULL);
+
+				/*
+				 * EXCEPT (without ALL): Scan the left-hand input only.
+				 * Duplicates need to be removed, so an AggPath is needed.
+				 */
+				if (can_hash)
+				{
+					/*
+					 * Because the set op target list is made up with Vars
+					 * with varno==0 we must do something to allow the Agg to
+					 * find the Vars it expects for grouping.  In the normal
+					 * case, when there are no dummy setop inputs, the
+					 * SetOpPath would do this for us.  Here we make use of an
+					 * AppendPath putting the left input as the only Append
+					 * subpath.  This is a little bit of a hack, but the
+					 * single child Append doesn't cost us much as setrefs.c
+					 * will remove it later in planning.
+					 */
+					path = (Path *) create_append_path(root, result_rel, list_make1(lpath),
+													   NIL, NIL, NULL, 0, false, -1);
+
+					/* Hashed aggregate plan --- no sort needed */
+					path = (Path *) create_agg_path(root,
+													result_rel,
+													path,
+													result_rel->reltarget,
+													AGG_HASHED,
+													AGGSPLIT_SIMPLE,
+													groupList,
+													NIL,
+													NULL,
+													dNumGroups);
+					add_path(result_rel, path);
+				}
+
+				if (can_sort)
+				{
+					path = get_cheapest_path_for_pathkeys(lrel->pathlist,
+														  nonunion_pathkeys,
+														  NULL,
+														  TOTAL_COST,
+														  false);
+
+					if (path)
+					{
+						/*
+						 * As above, create a single-child Append for Var
+						 * translation purposes.
+						 */
+						path = (Path *) create_append_path(root, result_rel, list_make1(path),
+														   NIL, NIL, NULL, 0, false, -1);
+						path = (Path *)
+							create_unique_path(root,
+											   result_rel,
+											   path,
+											   list_length(path->pathkeys),
+											   dNumGroups);
+						add_path(result_rel, path);
+					}
+
+					/*
+					 * Also try sorting the cheapest Path as that might be
+					 * cheaper than the best pre-sorted Path.  Again, add the
+					 * single child Append for Var translation, as above.
+					 */
+					path = (Path *) create_append_path(root, result_rel, list_make1(lpath),
+													   NIL, NIL, NULL, 0, false, -1);
+
+					path = (Path *)
+						create_sort_path(root, result_rel, path,
+										 make_pathkeys_for_sortclauses(root, groupList, tlist),
+										 -1.0);
+					path = (Path *) create_unique_path(root,
+													   result_rel,
+													   path,
+													   list_length(path->pathkeys),
+													   dNumGroups);
+					add_path(result_rel, path);
+
+				}
+				return result_rel;
+			}
+		}
+	}
+	else
+	{
+		/*
+		 * For INTERSECT, if either input is a dummy rel then we can mark the
+		 * result_rel as dummy since intersecting with an empty relation can
+		 * never yield any results.  This is true regardless of INTERSECT or
+		 * INTERSECT ALL.
+		 */
+		if (is_dummy_rel(lrel) || is_dummy_rel(rrel))
+		{
+			result_rel->reltarget = create_pathtarget(root, lpath_tlist);
+			mark_dummy_rel(result_rel);
+			return result_rel;
+		}
+	}
+
 	/*
 	 * Estimate number of distinct groups that we'll need hashtable entries
 	 * for; this is the size of the left-hand input for EXCEPT, or the smaller
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index d3ea433db15..4637862fa44 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1216,6 +1216,126 @@ select event_id
 
 drop table events_child, events, other_events;
 reset enable_indexonlyscan;
+--
+-- Test handling of UNION / EXCEPT / INTERSECT with provably empty inputs
+--
+-- Ensure the empty UNION input is pruned and de-duplication is done for the
+-- remaining relation.
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+              QUERY PLAN              
+--------------------------------------
+ Sort
+   Output: tenk1.four
+   Sort Key: tenk1.four
+   ->  HashAggregate
+         Output: tenk1.four
+         Group Key: tenk1.four
+         ->  Seq Scan on public.tenk1
+               Output: tenk1.four
+(8 rows)
+
+-- Validate that the results of the above are correct
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+ two 
+-----
+   0
+   1
+   2
+   3
+(4 rows)
+
+-- All UNION inputs are proven empty.  Ensure the planner provides a
+-- const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1 WHERE 1=2
+UNION
+SELECT ten FROM tenk1 WHERE 1=2;
+           QUERY PLAN           
+--------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate
+   One-Time Filter: false
+(4 rows)
+
+-- Ensure the planner provides a const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT four FROM tenk1;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- As above, with the inputs swapped
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery_1.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- Ensure the planner provides a const-false Result node when the left input
+-- is empty
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+EXCEPT
+SELECT four FROM tenk1;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- Ensure the planner only scans the left input and that it de-duplicates it
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1
+EXCEPT
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+              QUERY PLAN              
+--------------------------------------
+ Sort
+   Output: tenk1.two
+   Sort Key: tenk1.two
+   ->  HashAggregate
+         Output: tenk1.two
+         Group Key: tenk1.two
+         ->  Seq Scan on public.tenk1
+               Output: tenk1.two
+(8 rows)
+
+-- Check the results of the above
+SELECT two FROM tenk1
+EXCEPT
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+ two 
+-----
+   0
+   1
+(2 rows)
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 13700a6bfc4..b9f432068a4 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -459,6 +459,65 @@ drop table events_child, events, other_events;
 
 reset enable_indexonlyscan;
 
+--
+-- Test handling of UNION / EXCEPT / INTERSECT with provably empty inputs
+--
+
+-- Ensure the empty UNION input is pruned and de-duplication is done for the
+-- remaining relation.
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+
+-- Validate that the results of the above are correct
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+
+-- All UNION inputs are proven empty.  Ensure the planner provides a
+-- const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1 WHERE 1=2
+UNION
+SELECT ten FROM tenk1 WHERE 1=2;
+
+-- Ensure the planner provides a const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT four FROM tenk1;
+
+-- As above, with the inputs swapped
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2;
+
+-- Ensure the planner provides a const-false Result node when the left input
+-- is empty
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+EXCEPT
+SELECT four FROM tenk1;
+
+-- Ensure the planner only scans the left input and that it de-duplicates it
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1
+EXCEPT
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+
+-- Check the results of the above
+SELECT two FROM tenk1
+EXCEPT
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
-- 
2.43.0

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#1)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

David Rowley <dgrowleyml@gmail.com> writes:

In [1] there was a report that set operations didn't correctly detect
when inputs were provably empty sets. While this is not the bug that
the report claimed it to be, as it's just a missing optimisation, I
did decide to look at it to check if there was much performance to
gain from doing this.

I'm kind of resistant to the amount of code this patch adds in
comparison to the likely benefit. Sure, a badly written query can
profit, but is it worth debugging and maintaining a couple hundred
lines of code for that?

The first few hunks of changes seem fine by this light, but I think
you're expending too much effort on the EXCEPT-with-dummy-inputs
cases. I'm wondering if it could be shortened a great deal by
handling left-input-dummy and EXCEPT-ALL-with-right-input-dummy
but leaving the EXCEPT-with-right-input-dummy case unimproved.

regards, tom lane

#3David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#2)
2 attachment(s)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Thu, 2 Oct 2025 at 16:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm wondering if it could be shortened a great deal by
handling left-input-dummy and EXCEPT-ALL-with-right-input-dummy
but leaving the EXCEPT-with-right-input-dummy case unimproved.

Good idea. Less code and still get to keep the one that did well in
the benchmark. See attached.

I ended up splitting the patch in two. 0001 for UNION only, then 0002
for the INTERSECT and EXCEPT.

David

Attachments:

v2-0001-Teach-UNION-planner-to-remove-dummy-inputs.patchapplication/octet-stream; name=v2-0001-Teach-UNION-planner-to-remove-dummy-inputs.patchDownload
From a20eb281e8bc998a2a4ebb70b13d223093b04f71 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Wed, 30 Apr 2025 00:03:07 +1200
Subject: [PATCH v2 1/2] Teach UNION planner to remove dummy inputs

This adjusts UNION planning so that the planner produces more optimal
plans when one or more of the UNION's subqueries has been proven to be
empty (a dummy rel).

If any of the inputs are empty, then that input can be removed from the
Append / MergeAppend.  Previously, a const-false "Result" node would
appear to represent this.  Removing empty inputs has a few extra
benefits when only 1 union child remains as it means the Append or
MergeAppend can be removed in setrefs.c and also we can provide better
n_distinct estimates by looking at the remaining input's statistics.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com
---
 src/backend/optimizer/path/joinrels.c  |  2 +
 src/backend/optimizer/prep/prepunion.c | 49 ++++++++++++++++++++-----
 src/test/regress/expected/union.out    | 51 ++++++++++++++++++++++++++
 src/test/regress/sql/union.sql         | 27 ++++++++++++++
 4 files changed, 120 insertions(+), 9 deletions(-)

diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 535248aa525..43f4051bc6e 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1298,6 +1298,8 @@ is_dummy_rel(RelOptInfo *rel)
 			path = ((ProjectionPath *) path)->subpath;
 		else if (IsA(path, ProjectSetPath))
 			path = ((ProjectSetPath *) path)->subpath;
+		else if (IsA(path, SubqueryScanPath))
+			path = ((SubqueryScanPath *) path)->subpath;
 		else
 			break;
 	}
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 6bd0f4a5dc3..3314e6d7153 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -763,6 +763,10 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		RelOptInfo *rel = lfirst(lc);
 		Path	   *ordered_path;
 
+		/* Skip any UNION children that are proven not to yield any rows */
+		if (is_dummy_rel(rel))
+			continue;
+
 		cheapest_pathlist = lappend(cheapest_pathlist,
 									rel->cheapest_total_path);
 
@@ -812,6 +816,15 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	result_rel->consider_parallel = consider_parallel;
 	result_rel->consider_startup = (root->tuple_fraction > 0);
 
+	/* If all UNION children were dummy rels, make the resulting rel dummy */
+	if (cheapest_pathlist == NIL)
+	{
+		result_rel->reltarget = create_pathtarget(root, list_nth(tlist_list, 0));
+		mark_dummy_rel(result_rel);
+
+		return result_rel;
+	}
+
 	/*
 	 * Append the child results together using the cheapest paths from each
 	 * union child.
@@ -876,15 +889,33 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		bool		can_sort = grouping_is_sortable(groupList);
 		bool		can_hash = grouping_is_hashable(groupList);
 
-		/*
-		 * XXX for the moment, take the number of distinct groups as equal to
-		 * the total input size, i.e., the worst case.  This is too
-		 * conservative, but it's not clear how to get a decent estimate of
-		 * the true size.  One should note as well the propensity of novices
-		 * to write UNION rather than UNION ALL even when they don't expect
-		 * any duplicates...
-		 */
-		dNumGroups = apath->rows;
+		if (list_length(cheapest_pathlist) == 1)
+		{
+			Path	   *path = linitial(cheapest_pathlist);
+
+			/*
+			 * In the case where only one union child remains due to the
+			 * detection of one or more dummy union children, we'll obtain an
+			 * estimate on the surviving child directly.
+			 */
+			dNumGroups = estimate_num_groups(root,
+											 path->pathtarget->exprs,
+											 path->rows,
+											 NULL,
+											 NULL);
+		}
+		else
+		{
+			/*
+			 * Otherwise, for the moment, take the number of distinct groups
+			 * as equal to the total input size, i.e., the worst case.  This
+			 * is too conservative, but it's not clear how to get a decent
+			 * estimate of the true size.  One should note as well the
+			 * propensity of novices to write UNION rather than UNION ALL even
+			 * when they don't expect any duplicates...
+			 */
+			dNumGroups = apath->rows;
+		}
 
 		if (can_hash)
 		{
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index d3ea433db15..7c089e0d598 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1216,6 +1216,57 @@ select event_id
 
 drop table events_child, events, other_events;
 reset enable_indexonlyscan;
+--
+-- Test handling of UNION with provably empty inputs
+--
+-- Ensure the empty UNION input is pruned and de-duplication is done for the
+-- remaining relation.
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+              QUERY PLAN              
+--------------------------------------
+ Sort
+   Output: tenk1.four
+   Sort Key: tenk1.four
+   ->  HashAggregate
+         Output: tenk1.four
+         Group Key: tenk1.four
+         ->  Seq Scan on public.tenk1
+               Output: tenk1.four
+(8 rows)
+
+-- Validate that the results of the above are correct
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+ two 
+-----
+   0
+   1
+   2
+   3
+(4 rows)
+
+-- All UNION inputs are proven empty.  Ensure the planner provides a
+-- const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1 WHERE 1=2
+UNION
+SELECT ten FROM tenk1 WHERE 1=2;
+           QUERY PLAN           
+--------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate
+   One-Time Filter: false
+(4 rows)
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 13700a6bfc4..56bd20e741c 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -459,6 +459,33 @@ drop table events_child, events, other_events;
 
 reset enable_indexonlyscan;
 
+--
+-- Test handling of UNION with provably empty inputs
+--
+
+-- Ensure the empty UNION input is pruned and de-duplication is done for the
+-- remaining relation.
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+
+-- Validate that the results of the above are correct
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+
+-- All UNION inputs are proven empty.  Ensure the planner provides a
+-- const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1 WHERE 1=2
+UNION
+SELECT ten FROM tenk1 WHERE 1=2;
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
-- 
2.43.0

v2-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchapplication/octet-stream; name=v2-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchDownload
From b82667f5546d6292571f09f754d37245b2db4f27 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Thu, 2 Oct 2025 23:38:46 +1300
Subject: [PATCH v2 2/2] Teach planner to short-circuit EXCEPT/INTERSECT with
 dummy inputs

When either inputs of an INTERSECT operator are proven not to return any
results (a dummy rel), then mark the entire INTERSECT operation as dummy.

Likewise, if an EXCEPT operation's left input is proven empty, then mark
the entire operation as dummy.

With EXCEPT ALL, we can easily handle the right input being dummy as
we can return the left input without any processing.  The same isn't
true for EXCEPT (without ALL), as that would require deduplication of
the left input.  Wiring up those Paths is likely more complex than it's
worth, so let's leave that one alone.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com
---
 src/backend/optimizer/prep/prepunion.c | 64 ++++++++++++++++++++++++++
 src/test/regress/expected/union.out    | 57 ++++++++++++++++++++++-
 src/test/regress/sql/union.sql         | 28 ++++++++++-
 3 files changed, 147 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 3314e6d7153..e7790851b26 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1179,6 +1179,70 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 		result_rel->reltarget = create_setop_pathtarget(root, tlist,
 														list_make2(lpath, rpath));
 
+	/* Check for provably empty setop inputs and add short-circuit paths. */
+	if (op->op == SETOP_EXCEPT)
+	{
+		/*
+		 * For EXCEPTs, if the left side is dummy then there's no need to
+		 * inspect the right-hand side as scanning the right to find tuples to
+		 * remove won't make the left-hand input any more empty.
+		 */
+		if (is_dummy_rel(lrel))
+		{
+			result_rel->reltarget = create_pathtarget(root, lpath_tlist);
+			mark_dummy_rel(result_rel);
+
+			return result_rel;
+		}
+
+		/* Handle EXCEPTs with dummy right input */
+		if (is_dummy_rel(rrel))
+		{
+			if (op->all)
+			{
+				Path	   *apath;
+
+				/*
+				 * EXCEPT ALL: If the right-hand input is dummy then we can
+				 * simply scan the left-hand input.  To keep createplan.c
+				 * happy, use a single child Append to handle the translation
+				 * between the set op targetlist and the targetlist of the
+				 * left input.  The Append will be removed in setrefs.c.
+				 */
+				apath = (Path *) create_append_path(root, result_rel, list_make1(lpath),
+													NIL, NIL, NULL, 0, false, -1);
+
+				add_path(result_rel, apath);
+
+				return result_rel;
+			}
+			else
+			{
+				/*
+				 * To make EXCEPT with a dummy RHS work means having to
+				 * deduplicate the left input.  That could be done with
+				 * AggPaths, but doesn't seem worth the effort.  Let the
+				 * normal path generation code below handle this one.
+				 */
+			}
+		}
+	}
+	else
+	{
+		/*
+		 * For INTERSECT, if either input is a dummy rel then we can mark the
+		 * result_rel as dummy since intersecting with an empty relation can
+		 * never yield any results.  This is true regardless of INTERSECT or
+		 * INTERSECT ALL.
+		 */
+		if (is_dummy_rel(lrel) || is_dummy_rel(rrel))
+		{
+			result_rel->reltarget = create_pathtarget(root, lpath_tlist);
+			mark_dummy_rel(result_rel);
+			return result_rel;
+		}
+	}
+
 	/*
 	 * Estimate number of distinct groups that we'll need hashtable entries
 	 * for; this is the size of the left-hand input for EXCEPT, or the smaller
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 7c089e0d598..6856583c165 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1217,7 +1217,7 @@ select event_id
 drop table events_child, events, other_events;
 reset enable_indexonlyscan;
 --
--- Test handling of UNION with provably empty inputs
+-- Test handling of UNION / EXCEPT / INTERSECT with provably empty inputs
 --
 -- Ensure the empty UNION input is pruned and de-duplication is done for the
 -- remaining relation.
@@ -1267,6 +1267,61 @@ SELECT ten FROM tenk1 WHERE 1=2;
    One-Time Filter: false
 (4 rows)
 
+-- Ensure the planner provides a const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT four FROM tenk1;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- As above, with the inputs swapped
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery_1.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- Ensure the planner provides a const-false Result node when the left input
+-- is empty
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+EXCEPT
+SELECT four FROM tenk1;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- Ensure the planner only scans the left input
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1
+EXCEPT ALL
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+           QUERY PLAN           
+--------------------------------
+ Sort
+   Output: tenk1.two
+   Sort Key: tenk1.two
+   ->  Seq Scan on public.tenk1
+         Output: tenk1.two
+(5 rows)
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 56bd20e741c..e66239931cd 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -460,7 +460,7 @@ drop table events_child, events, other_events;
 reset enable_indexonlyscan;
 
 --
--- Test handling of UNION with provably empty inputs
+-- Test handling of UNION / EXCEPT / INTERSECT with provably empty inputs
 --
 
 -- Ensure the empty UNION input is pruned and de-duplication is done for the
@@ -486,6 +486,32 @@ SELECT four FROM tenk1 WHERE 1=2
 UNION
 SELECT ten FROM tenk1 WHERE 1=2;
 
+-- Ensure the planner provides a const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT four FROM tenk1;
+
+-- As above, with the inputs swapped
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2;
+
+-- Ensure the planner provides a const-false Result node when the left input
+-- is empty
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+EXCEPT
+SELECT four FROM tenk1;
+
+-- Ensure the planner only scans the left input
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1
+EXCEPT ALL
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
-- 
2.43.0

#4Mingli Zhang
zmlpostgres@gmail.com
In reply to: David Rowley (#3)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

Hi,

David Rowley <dgrowleyml@gmail.com>于2025年10月2日 周四20:09写道:

On Thu, 2 Oct 2025 at 16:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm wondering if it could be shortened a great deal by
handling left-input-dummy and EXCEPT-ALL-with-right-input-dummy
but leaving the EXCEPT-with-right-input-dummy case unimproved.

Good idea. Less code and still get to keep the one that did well in
the benchmark. See attached.

I ended up splitting the patch in two. 0001 for UNION only, then 0002
for the INTERSECT and EXCEPT.

David

It seems that the optimization for `UNION ALL` is already implemented in
the patch: it removes empty sub-paths and preserves the remaining ones.
Should we add a test case to formally validate this behavior like Union
cases?

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#3)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

David Rowley <dgrowleyml@gmail.com> writes:

Good idea. Less code and still get to keep the one that did well in
the benchmark. See attached.

0001's change in is_dummy_rel() seems like a complete hack, especially
since you didn't bother to change the adjacent comment. Why is that
necessary?

v2 LGTM otherwise.

regards, tom lane

#6David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#5)
2 attachment(s)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Fri, 3 Oct 2025 at 04:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

0001's change in is_dummy_rel() seems like a complete hack, especially
since you didn't bother to change the adjacent comment. Why is that
necessary?

build_setop_child_paths() wraps the child inputs in SubqueryScanPaths,
so we need to see through those.

An alternative way would be to propagate those during build_setop_child_paths()

--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -523,6 +523,13 @@ build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
  bool is_sorted;
  int presorted_keys;
+     /* If the input rel is dummy, propagate that to this query level */
+     if (is_dummy_rel(final_rel))
+     {
+         mark_dummy_rel(rel);
+         continue;
+     }
+

As attached.

v2 LGTM otherwise.

Thanks

David

Attachments:

v3-0001-Teach-UNION-planner-to-remove-dummy-inputs.patchapplication/octet-stream; name=v3-0001-Teach-UNION-planner-to-remove-dummy-inputs.patchDownload
From d9f3a8b7e6b9cab30731e48b518bf6cc33dd041d Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Wed, 30 Apr 2025 00:03:07 +1200
Subject: [PATCH v3 1/2] Teach UNION planner to remove dummy inputs

This adjusts UNION planning so that the planner produces more optimal
plans when one or more of the UNION's subqueries has been proven to be
empty (a dummy rel).

If any of the inputs are empty, then that input can be removed from the
Append / MergeAppend.  Previously, a const-false "Result" node would
appear to represent this.  Removing empty inputs has a few extra
benefits when only 1 union child remains as it means the Append or
MergeAppend can be removed in setrefs.c and also we can provide better
n_distinct estimates by looking at the remaining input's statistics.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com
---
 src/backend/optimizer/prep/prepunion.c | 56 +++++++++++++++++++++-----
 src/test/regress/expected/union.out    | 51 +++++++++++++++++++++++
 src/test/regress/sql/union.sql         | 27 +++++++++++++
 3 files changed, 125 insertions(+), 9 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 6bd0f4a5dc3..bf10988cd2d 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -523,6 +523,13 @@ build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
 		bool		is_sorted;
 		int			presorted_keys;
 
+		/* If the input rel is dummy, propagate that to this query level */
+		if (is_dummy_rel(final_rel))
+		{
+			mark_dummy_rel(rel);
+			continue;
+		}
+
 		/*
 		 * Include the cheapest path as-is so that the set operation can be
 		 * cheaply implemented using a method which does not require the input
@@ -763,6 +770,10 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		RelOptInfo *rel = lfirst(lc);
 		Path	   *ordered_path;
 
+		/* Skip any UNION children that are proven not to yield any rows */
+		if (is_dummy_rel(rel))
+			continue;
+
 		cheapest_pathlist = lappend(cheapest_pathlist,
 									rel->cheapest_total_path);
 
@@ -812,6 +823,15 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	result_rel->consider_parallel = consider_parallel;
 	result_rel->consider_startup = (root->tuple_fraction > 0);
 
+	/* If all UNION children were dummy rels, make the resulting rel dummy */
+	if (cheapest_pathlist == NIL)
+	{
+		result_rel->reltarget = create_pathtarget(root, list_nth(tlist_list, 0));
+		mark_dummy_rel(result_rel);
+
+		return result_rel;
+	}
+
 	/*
 	 * Append the child results together using the cheapest paths from each
 	 * union child.
@@ -876,15 +896,33 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		bool		can_sort = grouping_is_sortable(groupList);
 		bool		can_hash = grouping_is_hashable(groupList);
 
-		/*
-		 * XXX for the moment, take the number of distinct groups as equal to
-		 * the total input size, i.e., the worst case.  This is too
-		 * conservative, but it's not clear how to get a decent estimate of
-		 * the true size.  One should note as well the propensity of novices
-		 * to write UNION rather than UNION ALL even when they don't expect
-		 * any duplicates...
-		 */
-		dNumGroups = apath->rows;
+		if (list_length(cheapest_pathlist) == 1)
+		{
+			Path	   *path = linitial(cheapest_pathlist);
+
+			/*
+			 * In the case where only one union child remains due to the
+			 * detection of one or more dummy union children, we'll obtain an
+			 * estimate on the surviving child directly.
+			 */
+			dNumGroups = estimate_num_groups(root,
+											 path->pathtarget->exprs,
+											 path->rows,
+											 NULL,
+											 NULL);
+		}
+		else
+		{
+			/*
+			 * Otherwise, for the moment, take the number of distinct groups
+			 * as equal to the total input size, i.e., the worst case.  This
+			 * is too conservative, but it's not clear how to get a decent
+			 * estimate of the true size.  One should note as well the
+			 * propensity of novices to write UNION rather than UNION ALL even
+			 * when they don't expect any duplicates...
+			 */
+			dNumGroups = apath->rows;
+		}
 
 		if (can_hash)
 		{
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index d3ea433db15..7c089e0d598 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1216,6 +1216,57 @@ select event_id
 
 drop table events_child, events, other_events;
 reset enable_indexonlyscan;
+--
+-- Test handling of UNION with provably empty inputs
+--
+-- Ensure the empty UNION input is pruned and de-duplication is done for the
+-- remaining relation.
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+              QUERY PLAN              
+--------------------------------------
+ Sort
+   Output: tenk1.four
+   Sort Key: tenk1.four
+   ->  HashAggregate
+         Output: tenk1.four
+         Group Key: tenk1.four
+         ->  Seq Scan on public.tenk1
+               Output: tenk1.four
+(8 rows)
+
+-- Validate that the results of the above are correct
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+ two 
+-----
+   0
+   1
+   2
+   3
+(4 rows)
+
+-- All UNION inputs are proven empty.  Ensure the planner provides a
+-- const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1 WHERE 1=2
+UNION
+SELECT ten FROM tenk1 WHERE 1=2;
+           QUERY PLAN           
+--------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate
+   One-Time Filter: false
+(4 rows)
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 13700a6bfc4..56bd20e741c 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -459,6 +459,33 @@ drop table events_child, events, other_events;
 
 reset enable_indexonlyscan;
 
+--
+-- Test handling of UNION with provably empty inputs
+--
+
+-- Ensure the empty UNION input is pruned and de-duplication is done for the
+-- remaining relation.
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+
+-- Validate that the results of the above are correct
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1
+ORDER BY 1;
+
+-- All UNION inputs are proven empty.  Ensure the planner provides a
+-- const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+UNION
+SELECT four FROM tenk1 WHERE 1=2
+UNION
+SELECT ten FROM tenk1 WHERE 1=2;
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
-- 
2.43.0

v3-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchapplication/octet-stream; name=v3-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchDownload
From 89b88becf88ef4ea91aca161d26649396fa8dac5 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Thu, 2 Oct 2025 23:38:46 +1300
Subject: [PATCH v3 2/2] Teach planner to short-circuit EXCEPT/INTERSECT with
 dummy inputs

When either inputs of an INTERSECT operator are proven not to return any
results (a dummy rel), then mark the entire INTERSECT operation as dummy.

Likewise, if an EXCEPT operation's left input is proven empty, then mark
the entire operation as dummy.

With EXCEPT ALL, we can easily handle the right input being dummy as
we can return the left input without any processing.  The same isn't
true for EXCEPT (without ALL), as that would require deduplication of
the left input.  Wiring up those Paths is likely more complex than it's
worth, so let's leave that one alone.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com
---
 src/backend/optimizer/prep/prepunion.c | 64 ++++++++++++++++++++++++++
 src/test/regress/expected/union.out    | 57 ++++++++++++++++++++++-
 src/test/regress/sql/union.sql         | 28 ++++++++++-
 3 files changed, 147 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index bf10988cd2d..675f7d60d13 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1186,6 +1186,70 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 		result_rel->reltarget = create_setop_pathtarget(root, tlist,
 														list_make2(lpath, rpath));
 
+	/* Check for provably empty setop inputs and add short-circuit paths. */
+	if (op->op == SETOP_EXCEPT)
+	{
+		/*
+		 * For EXCEPTs, if the left side is dummy then there's no need to
+		 * inspect the right-hand side as scanning the right to find tuples to
+		 * remove won't make the left-hand input any more empty.
+		 */
+		if (is_dummy_rel(lrel))
+		{
+			result_rel->reltarget = create_pathtarget(root, lpath_tlist);
+			mark_dummy_rel(result_rel);
+
+			return result_rel;
+		}
+
+		/* Handle EXCEPTs with dummy right input */
+		if (is_dummy_rel(rrel))
+		{
+			if (op->all)
+			{
+				Path	   *apath;
+
+				/*
+				 * EXCEPT ALL: If the right-hand input is dummy then we can
+				 * simply scan the left-hand input.  To keep createplan.c
+				 * happy, use a single child Append to handle the translation
+				 * between the set op targetlist and the targetlist of the
+				 * left input.  The Append will be removed in setrefs.c.
+				 */
+				apath = (Path *) create_append_path(root, result_rel, list_make1(lpath),
+													NIL, NIL, NULL, 0, false, -1);
+
+				add_path(result_rel, apath);
+
+				return result_rel;
+			}
+			else
+			{
+				/*
+				 * To make EXCEPT with a dummy RHS work means having to
+				 * deduplicate the left input.  That could be done with
+				 * AggPaths, but doesn't seem worth the effort.  Let the
+				 * normal path generation code below handle this one.
+				 */
+			}
+		}
+	}
+	else
+	{
+		/*
+		 * For INTERSECT, if either input is a dummy rel then we can mark the
+		 * result_rel as dummy since intersecting with an empty relation can
+		 * never yield any results.  This is true regardless of INTERSECT or
+		 * INTERSECT ALL.
+		 */
+		if (is_dummy_rel(lrel) || is_dummy_rel(rrel))
+		{
+			result_rel->reltarget = create_pathtarget(root, lpath_tlist);
+			mark_dummy_rel(result_rel);
+			return result_rel;
+		}
+	}
+
 	/*
 	 * Estimate number of distinct groups that we'll need hashtable entries
 	 * for; this is the size of the left-hand input for EXCEPT, or the smaller
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 7c089e0d598..6856583c165 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1217,7 +1217,7 @@ select event_id
 drop table events_child, events, other_events;
 reset enable_indexonlyscan;
 --
--- Test handling of UNION with provably empty inputs
+-- Test handling of UNION / EXCEPT / INTERSECT with provably empty inputs
 --
 -- Ensure the empty UNION input is pruned and de-duplication is done for the
 -- remaining relation.
@@ -1267,6 +1267,61 @@ SELECT ten FROM tenk1 WHERE 1=2;
    One-Time Filter: false
 (4 rows)
 
+-- Ensure the planner provides a const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT four FROM tenk1;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- As above, with the inputs swapped
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery_1.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- Ensure the planner provides a const-false Result node when the left input
+-- is empty
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+EXCEPT
+SELECT four FROM tenk1;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   Output: unnamed_subquery.two
+   Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+   One-Time Filter: false
+(4 rows)
+
+-- Ensure the planner only scans the left input
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1
+EXCEPT ALL
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+           QUERY PLAN           
+--------------------------------
+ Sort
+   Output: tenk1.two
+   Sort Key: tenk1.two
+   ->  Seq Scan on public.tenk1
+         Output: tenk1.two
+(5 rows)
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 56bd20e741c..e66239931cd 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -460,7 +460,7 @@ drop table events_child, events, other_events;
 reset enable_indexonlyscan;
 
 --
--- Test handling of UNION with provably empty inputs
+-- Test handling of UNION / EXCEPT / INTERSECT with provably empty inputs
 --
 
 -- Ensure the empty UNION input is pruned and de-duplication is done for the
@@ -486,6 +486,32 @@ SELECT four FROM tenk1 WHERE 1=2
 UNION
 SELECT ten FROM tenk1 WHERE 1=2;
 
+-- Ensure the planner provides a const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT four FROM tenk1;
+
+-- As above, with the inputs swapped
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2;
+
+-- Ensure the planner provides a const-false Result node when the left input
+-- is empty
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+EXCEPT
+SELECT four FROM tenk1;
+
+-- Ensure the planner only scans the left input
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1
+EXCEPT ALL
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
-- 
2.43.0

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#6)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

David Rowley <dgrowleyml@gmail.com> writes:

On Fri, 3 Oct 2025 at 04:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

0001's change in is_dummy_rel() seems like a complete hack, especially
since you didn't bother to change the adjacent comment. Why is that
necessary?

build_setop_child_paths() wraps the child inputs in SubqueryScanPaths,
so we need to see through those.

Ah.

An alternative way would be to propagate those during build_setop_child_paths()

That answer works for me. I was expecting you to just document the
need for the extra check in is_dummy_rel ;-) ... but this way is
perhaps better.

regards, tom lane

#8David Rowley
dgrowleyml@gmail.com
In reply to: Mingli Zhang (#4)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Fri, 3 Oct 2025 at 02:55, Mingli Zhang <zmlpostgres@gmail.com> wrote:

It seems that the optimization for `UNION ALL` is already implemented in the patch: it removes empty sub-paths and preserves the remaining ones.
Should we add a test case to formally validate this behavior like Union cases?

If I were to do that, I'd have to come up with something that's
flatten_simple_union_all() proof. Maybe something like varying types
in the targetlist. I think it's probably not really worthwhile since
it's not testing any new code that is not already being tested by the
tests that I did add.

David

#9David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#7)
2 attachment(s)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Fri, 3 Oct 2025 at 10:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <dgrowleyml@gmail.com> writes:

An alternative way would be to propagate those during build_setop_child_paths()

That answer works for me. I was expecting you to just document the
need for the extra check in is_dummy_rel ;-) ... but this way is
perhaps better.

So, I pushed the UNION portion earlier, but on hacking more on the
EXCEPT/INTERSECT patch, I noticed that I don't have the target lists
correct when marking the top-level set op as dummy. I had thought it
was ok to use the target list of the first child. I did that to make
EXPLAIN VERBOSE work as it chokes on the varno==0 top-level setop
targetlist as generated by generate_append_tlist(). However, using the
target list of the first child isn't right as createplan will choke on
not finding PathKeys to sort.

It looks like the normal UNION case side steps this issue for T_SetOp
by applying set_dummy_tlist_references() in setrefs.c. That doesn't
happen for Result since we may have something to evaluate there.

I'm just considering the best fix and can think of two options:

1) Move away from using varno==0 in generate_append_tlist(). Use varno==1, or;
2) Add handling in setrefs.c for T_Result to adjust varno==0 Vars to
use varno==1 vars.

The attached v4-0001 does #2, but wondering if #1 should be explored first.

David

Attachments:

v4-0001-Fix-incorrect-use-of-targetlist-in-dummy-UNIONs.patchapplication/octet-stream; name=v4-0001-Fix-incorrect-use-of-targetlist-in-dummy-UNIONs.patchDownload
From 44bbaa642fe09f38fb0d49434b236ac5c1621a3d Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Sat, 4 Oct 2025 16:26:32 +1300
Subject: [PATCH v4 1/2] Fix incorrect use of targetlist in dummy UNIONs

I had thought that using the targetlist of the first UNION child was ok,
but it's not as this could cause:

ERROR:  could not find pathkey item to sort

Instead, use the top-level UNION's targetlist and fix setrefs.c so it
rewrites the varno==0 Vars into varno==1 vars so that we display the
targetlist item from the first UNION child.  If we didn't do this,
EXPLAIN VERBOSE would give us something like:

ERROR:  bogus varno: 0
---
 src/backend/optimizer/plan/setrefs.c   | 24 ++++++++++++++++++++----
 src/backend/optimizer/prep/prepunion.c |  1 -
 src/test/regress/expected/union.out    | 18 +++++++++++-------
 src/test/regress/sql/union.sql         |  3 ++-
 4 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 6950eff2c5b..10d0cc4c2d3 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1034,16 +1034,32 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
 					 * expected to occur here, it seems safer to special-case
 					 * it here and keep the assertions that ROWID_VARs
 					 * shouldn't be seen by fix_scan_expr.
+					 *
+					 * We also must handle the case where set operations have
+					 * been short-circuited resulting in a dummy Result node.
+					 * prepunion.c uses varno==0 for the set op targetlist.
+					 * Rewrite these to use varno==1.  Otherwise EXPLAIN will
+					 * have trouble displaying targetlists.
 					 */
 					foreach(l, splan->plan.targetlist)
 					{
 						TargetEntry *tle = (TargetEntry *) lfirst(l);
 						Var		   *var = (Var *) tle->expr;
 
-						if (var && IsA(var, Var) && var->varno == ROWID_VAR)
-							tle->expr = (Expr *) makeNullConst(var->vartype,
-															   var->vartypmod,
-															   var->varcollid);
+						if (var && IsA(var, Var))
+						{
+							if (var->varno == ROWID_VAR)
+								tle->expr = (Expr *) makeNullConst(var->vartype,
+																   var->vartypmod,
+																   var->varcollid);
+							else if (var->varno == 0)
+								tle->expr = (Expr *) makeVar(1,
+															 var->varattno,
+															 var->vartype,
+															 var->vartypmod,
+															 var->varcollid,
+															 var->varlevelsup);
+						}
 					}
 
 					splan->plan.targetlist =
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 547dbd53540..da9431108a2 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -826,7 +826,6 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	/* If all UNION children were dummy rels, make the resulting rel dummy */
 	if (cheapest_pathlist == NIL)
 	{
-		result_rel->reltarget = create_pathtarget(root, list_nth(tlist_list, 0));
 		mark_dummy_rel(result_rel);
 
 		return result_rel;
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 7c089e0d598..15931beea3a 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1258,14 +1258,18 @@ SELECT two FROM tenk1 WHERE 1=2
 UNION
 SELECT four FROM tenk1 WHERE 1=2
 UNION
-SELECT ten FROM tenk1 WHERE 1=2;
-           QUERY PLAN           
---------------------------------
- Result
+SELECT ten FROM tenk1 WHERE 1=2
+ORDER BY 1;
+              QUERY PLAN              
+--------------------------------------
+ Sort
    Output: unnamed_subquery.two
-   Replaces: Aggregate
-   One-Time Filter: false
-(4 rows)
+   Sort Key: unnamed_subquery.two
+   ->  Result
+         Output: unnamed_subquery.two
+         Replaces: Aggregate
+         One-Time Filter: false
+(7 rows)
 
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 56bd20e741c..e252316f69b 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -484,7 +484,8 @@ SELECT two FROM tenk1 WHERE 1=2
 UNION
 SELECT four FROM tenk1 WHERE 1=2
 UNION
-SELECT ten FROM tenk1 WHERE 1=2;
+SELECT ten FROM tenk1 WHERE 1=2
+ORDER BY 1;
 
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
-- 
2.43.0

v4-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchapplication/octet-stream; name=v4-0002-Teach-planner-to-short-circuit-EXCEPT-INTERSECT-w.patchDownload
From f47a8904d5c4366ba7dd7e0c0672d35cd04f8e92 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Thu, 2 Oct 2025 23:38:46 +1300
Subject: [PATCH v4 2/2] Teach planner to short-circuit EXCEPT/INTERSECT with
 dummy inputs

When either inputs of an INTERSECT [ALL] operator are proven not to return
any results (a dummy rel), then mark the entire INTERSECT operation as
dummy.

Likewise, if an EXCEPT [ALL] operation's left input is proven empty, then
mark the entire operation as dummy.

With EXCEPT ALL, we can easily handle the right input being dummy as
we can return the left input without any processing.  That can lead to
significant performance gains during query execution.  We can't easily
handle dummy right inputs for EXCEPT (without ALL), as that would require
deduplication of the left input.  Wiring up those Paths is likely more
complex than it's worth as the gains during execution aren't that great,
so let's leave that one to be handled by the normal Path generation code.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com
---
 src/backend/optimizer/prep/prepunion.c | 63 +++++++++++++++++++
 src/test/regress/expected/union.out    | 86 +++++++++++++++++++++++++-
 src/test/regress/sql/union.sql         | 38 +++++++++++-
 3 files changed, 185 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index da9431108a2..e5d9e6028b2 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1185,6 +1185,69 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 		result_rel->reltarget = create_setop_pathtarget(root, tlist,
 														list_make2(lpath, rpath));
 
+	/* Check for provably empty setop inputs and add short-circuit paths. */
+	if (op->op == SETOP_EXCEPT)
+	{
+		/*
+		 * For EXCEPTs, if the left side is dummy then there's no need to
+		 * inspect the right-hand side as scanning the right to find tuples to
+		 * remove won't make the left-hand input any more empty.
+		 */
+		if (is_dummy_rel(lrel))
+		{
+			mark_dummy_rel(result_rel);
+
+			return result_rel;
+		}
+
+		/* Handle EXCEPTs with dummy right input */
+		if (is_dummy_rel(rrel))
+		{
+			if (op->all)
+			{
+				Path	   *apath;
+
+				/*
+				 * EXCEPT ALL: If the right-hand input is dummy then we can
+				 * simply scan the left-hand input.  To keep createplan.c
+				 * happy, use a single child Append to handle the translation
+				 * between the set op targetlist and the targetlist of the
+				 * left input.  The Append will be removed in setrefs.c.
+				 */
+				apath = (Path *) create_append_path(root, result_rel, list_make1(lpath),
+													NIL, NIL, NULL, 0, false, -1);
+
+				add_path(result_rel, apath);
+
+				return result_rel;
+			}
+			else
+			{
+				/*
+				 * To make EXCEPT with a dummy RHS work means having to
+				 * deduplicate the left input.  That could be done with
+				 * AggPaths, but doesn't seem worth the effort.  Let the
+				 * normal path generation code below handle this one.
+				 */
+			}
+		}
+	}
+	else
+	{
+		/*
+		 * For INTERSECT, if either input is a dummy rel then we can mark the
+		 * result_rel as dummy since intersecting with an empty relation can
+		 * never yield any results.  This is true regardless of INTERSECT or
+		 * INTERSECT ALL.
+		 */
+		if (is_dummy_rel(lrel) || is_dummy_rel(rrel))
+		{
+			mark_dummy_rel(result_rel);
+
+			return result_rel;
+		}
+	}
+
 	/*
 	 * Estimate number of distinct groups that we'll need hashtable entries
 	 * for; this is the size of the left-hand input for EXCEPT, or the smaller
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 15931beea3a..22dc81e4d79 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1217,7 +1217,7 @@ select event_id
 drop table events_child, events, other_events;
 reset enable_indexonlyscan;
 --
--- Test handling of UNION with provably empty inputs
+-- Test handling of UNION / EXCEPT / INTERSECT with provably empty inputs
 --
 -- Ensure the empty UNION input is pruned and de-duplication is done for the
 -- remaining relation.
@@ -1271,6 +1271,90 @@ ORDER BY 1;
          One-Time Filter: false
 (7 rows)
 
+-- Ensure the planner provides a const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT four FROM tenk1
+ORDER BY 1;
+                             QUERY PLAN                              
+---------------------------------------------------------------------
+ Sort
+   Output: unnamed_subquery.two
+   Sort Key: unnamed_subquery.two
+   ->  Result
+         Output: unnamed_subquery.two
+         Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+         One-Time Filter: false
+(7 rows)
+
+-- As above, with the inputs swapped
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2
+ORDER BY 1;
+                             QUERY PLAN                              
+---------------------------------------------------------------------
+ Sort
+   Output: unnamed_subquery.four
+   Sort Key: unnamed_subquery.four
+   ->  Result
+         Output: unnamed_subquery.four
+         Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+         One-Time Filter: false
+(7 rows)
+
+-- Try with both inputs dummy
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2
+ORDER BY 1;
+                             QUERY PLAN                              
+---------------------------------------------------------------------
+ Sort
+   Output: unnamed_subquery.four
+   Sort Key: unnamed_subquery.four
+   ->  Result
+         Output: unnamed_subquery.four
+         Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+         One-Time Filter: false
+(7 rows)
+
+-- Ensure the planner provides a const-false Result node when the left input
+-- is empty
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+EXCEPT
+SELECT four FROM tenk1
+ORDER BY 1;
+                             QUERY PLAN                              
+---------------------------------------------------------------------
+ Sort
+   Output: unnamed_subquery.two
+   Sort Key: unnamed_subquery.two
+   ->  Result
+         Output: unnamed_subquery.two
+         Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1
+         One-Time Filter: false
+(7 rows)
+
+-- Ensure the planner only scans the left input
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1
+EXCEPT ALL
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+           QUERY PLAN           
+--------------------------------
+ Sort
+   Output: tenk1.two
+   Sort Key: tenk1.two
+   ->  Seq Scan on public.tenk1
+         Output: tenk1.two
+(5 rows)
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index e252316f69b..63b8b9a0d1d 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -460,7 +460,7 @@ drop table events_child, events, other_events;
 reset enable_indexonlyscan;
 
 --
--- Test handling of UNION with provably empty inputs
+-- Test handling of UNION / EXCEPT / INTERSECT with provably empty inputs
 --
 
 -- Ensure the empty UNION input is pruned and de-duplication is done for the
@@ -487,6 +487,42 @@ UNION
 SELECT ten FROM tenk1 WHERE 1=2
 ORDER BY 1;
 
+-- Ensure the planner provides a const-false Result node
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT four FROM tenk1
+ORDER BY 1;
+
+-- As above, with the inputs swapped
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2
+ORDER BY 1;
+
+-- Try with both inputs dummy
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT four FROM tenk1 WHERE 1=2
+INTERSECT
+SELECT two FROM tenk1 WHERE 1=2
+ORDER BY 1;
+
+-- Ensure the planner provides a const-false Result node when the left input
+-- is empty
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 WHERE 1=2
+EXCEPT
+SELECT four FROM tenk1
+ORDER BY 1;
+
+-- Ensure the planner only scans the left input
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1
+EXCEPT ALL
+SELECT four FROM tenk1 WHERE 1=2
+ORDER BY 1;
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
-- 
2.43.0

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#9)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

David Rowley <dgrowleyml@gmail.com> writes:

I'm just considering the best fix and can think of two options:

1) Move away from using varno==0 in generate_append_tlist(). Use varno==1, or;
2) Add handling in setrefs.c for T_Result to adjust varno==0 Vars to
use varno==1 vars.

The attached v4-0001 does #2, but wondering if #1 should be explored first.

I don't recall the details ATM, but if you poke around you will find
multiple comments complaining about how that varno-zero convention is
problematic or requires code to do something unusual. So I'd be in
favor of trying to get rid of it, but I'm not entirely sure what to do
instead, and the ramifications might be wider than you realize.

In particular it's not clear to me why varno==1 is better? As best
I can recall without diving into code, the fundamental mismatch is
that varno zero doesn't correspond to any RTE. It would be better
if the Vars matched the subquery RTEs that are at the base of the
set-operation nest, so that there were a useful referent as to where
a Var came from. Arbitrarily setting varno=1 sounds like the worst
case: we could neither identify a Var with a source subquery
accurately, nor realize that its varno is phony.

regards, tom lane

#11David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#10)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Sun, 5 Oct 2025 at 04:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:

In particular it's not clear to me why varno==1 is better? As best
I can recall without diving into code, the fundamental mismatch is
that varno zero doesn't correspond to any RTE. It would be better
if the Vars matched the subquery RTEs that are at the base of the
set-operation nest, so that there were a useful referent as to where
a Var came from. Arbitrarily setting varno=1 sounds like the worst
case: we could neither identify a Var with a source subquery
accurately, nor realize that its varno is phony.

I've not tried it yet, but the idea with varno==1 is that is that it
does index the first UNION child's RTE. For the case at hand, all the
children are dummies. I thought doing this was ok because Appends and
MergeAppends show the target entries for the first child, and cases
like the following do show Vars from RTEs that don't get scanned in
the plan:

# explain verbose select * from t where 1=2 order by 1;

Sort (cost=0.01..0.02 rows=0 width=0)
Output: a, b
Sort Key: t.a
-> Result (cost=0.00..0.00 rows=0 width=0)
Output: a, b
Replaces: Scan on t
One-Time Filter: false

Another alternative that I'm thinking about which might be better is
to double-down on the varno==0 and invent a special varno and define
SETOP_VAR. I'd feel better about doing that as I didn't feel good
about coding the magic number for the if(var->varno == 0) check in
set_plan_refs() to make the T_Result work in EXPLAIN VERBOSE.

David

#12David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#11)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Sun, 5 Oct 2025 at 13:56, David Rowley <dgrowleyml@gmail.com> wrote:

Another alternative that I'm thinking about which might be better is
to double-down on the varno==0 and invent a special varno and define
SETOP_VAR. I'd feel better about doing that as I didn't feel good
about coding the magic number for the if(var->varno == 0) check in
set_plan_refs() to make the T_Result work in EXPLAIN VERBOSE.

I decided not to do it that way and instead just added code to create
a varno==1 Var in setrefs.c. This basically amounts to following on
with the varno==0 hack used in prepunion.c.

The reason I didn't go down the route of SETOP_VAR was that it's still
a hack, it's just making it look a bit more official. I suppose the
correct way to fix all this and get rid of the varno==0 stuff forever
is to have a proper top-level RTE for the top-level set operation and
make it so each child is an OTHER_MEMBER rel at that query level. It
felt like going a bit too far to do something like that to fix this
bug, so I didn't explore that further.

David

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#12)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

David Rowley <dgrowleyml@gmail.com> writes:

The reason I didn't go down the route of SETOP_VAR was that it's still
a hack, it's just making it look a bit more official. I suppose the
correct way to fix all this and get rid of the varno==0 stuff forever
is to have a proper top-level RTE for the top-level set operation and
make it so each child is an OTHER_MEMBER rel at that query level. It
felt like going a bit too far to do something like that to fix this
bug, so I didn't explore that further.

Yeah, I think "more RTEs" is the ultimate solution here, but it's
never risen to the top of my to-do list either. I was kind of
thinking about an RTE per set-op child, though. Not sure if one
for the top-level op, or one for an intermediate op, would help;
though it's certainly possible it would.

regards, tom lane

#14Alexander Lakhin
exclusion@gmail.com
In reply to: David Rowley (#9)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

Hello David,

04.10.2025 06:55, David Rowley wrote:

On Fri, 3 Oct 2025 at 10:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <dgrowleyml@gmail.com> writes:

An alternative way would be to propagate those during build_setop_child_paths()

That answer works for me. I was expecting you to just document the
need for the extra check in is_dummy_rel ;-) ... but this way is
perhaps better.

So, I pushed the UNION portion earlier, but on hacking more on the
EXCEPT/INTERSECT patch, I noticed that I don't have the target lists
correct when marking the top-level set op as dummy. ...

Please look at a new anomaly, introduced with 03d40e4b5:
CREATE TABLE t(i integer);
CREATE TABLE pt(i integer) PARTITION BY LIST(i);

SET enable_seqscan = 'off';
SELECT * FROM t UNION SELECT * FROM t
UNION ALL
SELECT * FROM pt;
produces:
ERROR:  XX000: unrecognized node type: 0
LOCATION:  create_plan_recurse, createplan.c:538

Best regards.
Alexander

#15Richard Guo
guofenglinux@gmail.com
In reply to: Alexander Lakhin (#14)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Tue, Nov 4, 2025 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:

Please look at a new anomaly, introduced with 03d40e4b5:
CREATE TABLE t(i integer);
CREATE TABLE pt(i integer) PARTITION BY LIST(i);

SET enable_seqscan = 'off';
SELECT * FROM t UNION SELECT * FROM t
UNION ALL
SELECT * FROM pt;
produces:
ERROR: XX000: unrecognized node type: 0
LOCATION: create_plan_recurse, createplan.c:538

I looked into this. The child relation with relid 3 (the scan on the
partitioned table) is a dummy, so it is skipped in
generate_union_paths(). As a result, the final setop relation ends up
the same as the child relation with relid set to (1, 2). Then,
generate_union_paths() creates an Append path using this relation's
cheapest path as its subpath. Somehow, add_path() determines that
this new Append path dominates the original cheapest path, causing the
original cheapest path to be freed. This leads to the final Append
path referencing a subpath that has already been freed.

- Richard

#16David Rowley
dgrowleyml@gmail.com
In reply to: Alexander Lakhin (#14)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Tue, 4 Nov 2025 at 21:00, Alexander Lakhin <exclusion@gmail.com> wrote:

Please look at a new anomaly, introduced with 03d40e4b5:
CREATE TABLE t(i integer);
CREATE TABLE pt(i integer) PARTITION BY LIST(i);

SET enable_seqscan = 'off';
SELECT * FROM t UNION SELECT * FROM t
UNION ALL
SELECT * FROM pt;
produces:
ERROR: XX000: unrecognized node type: 0

Thanks for the report.

This seems to be due to the fact that the same UPPERREL_SETOP
RelOptInfo is being used for the UNION and UNION ALL. When doing the
add_path() for the UNION ALL at prepunion.c:1030, the sole child path
of the Append being added is an AggPath for the previous UNION
operation. When we add_path() the AppendPath, the previous AggPath is
already in the pathlist of the result_rel. add_path() ends up freeing
the old Path due to the new AppendPath being better and the end result
is the Append's subpath is free'd.

The reason we end up with the same result_rel is that we're not
passing all the relids in fetch_upper_rel(root, UPPERREL_SETOP,
relids) due to having removed dummy rels. I guess the fix might be
something like record the relids even when skipping dummy relations.
I'll go and explore that as an option.

David

#17David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#16)
1 attachment(s)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Tue, 4 Nov 2025 at 22:54, David Rowley <dgrowleyml@gmail.com> wrote:

The reason we end up with the same result_rel is that we're not
passing all the relids in fetch_upper_rel(root, UPPERREL_SETOP,
relids) due to having removed dummy rels. I guess the fix might be
something like record the relids even when skipping dummy relations.
I'll go and explore that as an option.

This seems to fix it. I'll study it more in the morning (it's late in
my time zone).

David

Attachments:

minimal_fix.patchapplication/octet-stream; name=minimal_fix.patchDownload
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 55665824179..e622c022272 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -773,6 +773,12 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		RelOptInfo *rel = lfirst(lc);
 		Path	   *ordered_path;
 
+		/*
+		 * Record the relids so that we correctly identify the correct
+		 * UPPERREL_SETOP below.
+		 */
+		relids = bms_add_members(relids, rel->relids);
+
 		/* Skip any UNION children that are proven not to yield any rows */
 		if (is_dummy_rel(rel))
 			continue;
@@ -815,8 +821,6 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 				partial_pathlist = lappend(partial_pathlist,
 										   linitial(rel->partial_pathlist));
 		}
-
-		relids = bms_add_members(relids, rel->relids);
 	}
 
 	/* Build result relation. */
#18Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#15)
1 attachment(s)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Tue, Nov 4, 2025 at 6:19 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Tue, Nov 4, 2025 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:

Please look at a new anomaly, introduced with 03d40e4b5:
CREATE TABLE t(i integer);
CREATE TABLE pt(i integer) PARTITION BY LIST(i);

SET enable_seqscan = 'off';
SELECT * FROM t UNION SELECT * FROM t
UNION ALL
SELECT * FROM pt;
produces:
ERROR: XX000: unrecognized node type: 0
LOCATION: create_plan_recurse, createplan.c:538

I looked into this. The child relation with relid 3 (the scan on the
partitioned table) is a dummy, so it is skipped in
generate_union_paths(). As a result, the final setop relation ends up
the same as the child relation with relid set to (1, 2). Then,
generate_union_paths() creates an Append path using this relation's
cheapest path as its subpath. Somehow, add_path() determines that
this new Append path dominates the original cheapest path, causing the
original cheapest path to be freed. This leads to the final Append
path referencing a subpath that has already been freed.

I think a quick fix is to always include the involved child relids
when building the relid set for the setop relation, even if the child
is dummy. A dummy child does not yield any rows, but theoretically it
is still a relation that we should account for.

- Richard

Attachments:

v1-0001-Fix-generate_union_paths.patchapplication/octet-stream; name=v1-0001-Fix-generate_union_paths.patchDownload
From 101e64979f476b376e40fd4995361f194a5414b5 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Tue, 4 Nov 2025 18:57:26 +0900
Subject: [PATCH v1] Fix generate_union_paths()

---
 src/backend/optimizer/prep/prepunion.c | 8 ++++++--
 src/test/regress/expected/union.out    | 6 +++---
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 55665824179..464fc15d115 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -773,6 +773,12 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		RelOptInfo *rel = lfirst(lc);
 		Path	   *ordered_path;
 
+		/*
+		 * Include this child's relids in the resulting relid set, even if the
+		 * child is a dummy relation.
+		 */
+		relids = bms_add_members(relids, rel->relids);
+
 		/* Skip any UNION children that are proven not to yield any rows */
 		if (is_dummy_rel(rel))
 			continue;
@@ -815,8 +821,6 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 				partial_pathlist = lappend(partial_pathlist,
 										   linitial(rel->partial_pathlist));
 		}
-
-		relids = bms_add_members(relids, rel->relids);
 	}
 
 	/* Build result relation. */
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index fb77d108337..4533967e84a 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1260,14 +1260,14 @@ SELECT four FROM tenk1 WHERE 1=2
 UNION
 SELECT ten FROM tenk1 WHERE 1=2
 ORDER BY 1;
-              QUERY PLAN              
---------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
    Output: unnamed_subquery.two
    Sort Key: unnamed_subquery.two
    ->  Result
          Output: unnamed_subquery.two
-         Replaces: Aggregate
+         Replaces: Aggregate on unnamed_subquery, unnamed_subquery_1, unnamed_subquery_2
          One-Time Filter: false
 (7 rows)
 
-- 
2.39.5 (Apple Git-154)

#19David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#17)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Tue, 4 Nov 2025 at 23:00, David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 4 Nov 2025 at 22:54, David Rowley <dgrowleyml@gmail.com> wrote:

The reason we end up with the same result_rel is that we're not
passing all the relids in fetch_upper_rel(root, UPPERREL_SETOP,
relids) due to having removed dummy rels. I guess the fix might be
something like record the relids even when skipping dummy relations.
I'll go and explore that as an option.

This seems to fix it. I'll study it more in the morning (it's late in
my time zone).

I went over this again this morning. I considered adding the following test:

-- Try a more complex case with multiple chained UNIONs
EXPLAIN (COSTS OFF)
SELECT two FROM tenk1
UNION
SELECT four FROM tenk1
UNION ALL
SELECT ten FROM tenk1 WHERE 1=2;

but it seems that enable_seqscan does also need to be disabled as
otherwise add_path() just finds the new and old path to cost the same
and rejects the new path. With enable_seqscan = off,
compare_path_costs_fuzzily() will find the old path to have
disabled_node = 2 and the new one to have disabled_nodes = 0, so
accepts the new and pfree's the old.

I finally decided that it was a bit too obscure a scenario to test to
verify that the same silly mistake didn't reappear.

Thanks again for the report and the simple recreation steps.

David

#20Alexander Lakhin
exclusion@gmail.com
In reply to: David Rowley (#19)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

Hello David!

05.11.2025 00:55, David Rowley wrote:

I finally decided that it was a bit too obscure a scenario to test to
verify that the same silly mistake didn't reappear.

Thanks again for the report and the simple recreation steps.

Thank you for the fix!

But while playing around, I've just discovered another interesting error:

SET enable_seqscan = 'off';
SELECT * FROM t EXCEPT SELECT * FROM t
UNION SELECT * FROM pt;

leads to:
ERROR:  XX000: no relation entry for relid 0
LOCATION:  find_base_rel, relnode.c:541

Best regards,
Alexander

#21David Rowley
dgrowleyml@gmail.com
In reply to: Alexander Lakhin (#20)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Wed, 5 Nov 2025 at 17:00, Alexander Lakhin <exclusion@gmail.com> wrote:

But while playing around, I've just discovered another interesting error:

SET enable_seqscan = 'off';
SELECT * FROM t EXCEPT SELECT * FROM t
UNION SELECT * FROM pt;

leads to:
ERROR: XX000: no relation entry for relid 0
LOCATION: find_base_rel, relnode.c:541

:-(

This happens because of the code I added to try to give better
estimates to the number of groups when only a single child remains in
the UNION. Because the only surviving Path is for the EXCEPT set-op,
there are Vars with varno==0, which estimate_num_groups() does not
like. I'm considering restricting that code Path to where the
path->parent->reloptkind != RELOPT_UPPER_REL. It doesn't feel great
adding the special case, but likely having better row estimates is
worthwhile, when it's possible. I'll sit on that for a bit and see if
I can think of a better way.

(Yet another reason we should probably fix the varno==0 issue.)

Thanks for the report and reproducer, again!

David

#22David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#21)
1 attachment(s)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Wed, 5 Nov 2025 at 18:26, David Rowley <dgrowleyml@gmail.com> wrote:

I'm considering restricting that code Path to where the
path->parent->reloptkind != RELOPT_UPPER_REL.

I couldn't think of anything better, so here's a patch to that effect.

David

Attachments:

v1-0001-Fix-UNION-planner-estimate_num_groups-with-varno-.patchapplication/octet-stream; name=v1-0001-Fix-UNION-planner-estimate_num_groups-with-varno-.patchDownload
From eb4641314f9f0f779d072db3d5f5da0ec151faa0 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Wed, 5 Nov 2025 21:44:45 +1300
Subject: [PATCH v1] Fix UNION planner estimate_num_groups with varno==0

03d40e4b5 added code to provide better row estimates for when a UNION
query ended up only with a single child due to other children being
found to be dummy rels.  In that case, ordinarily it would be ok to call
estimate_num_groups() on the targetlist of the only child path, however
that's not safe to do if the UNION child is the result of some other set
operation as we generate targetlists containing varno==0 for those,
which estimate_num_groups() can't handle.

Fix this by avoiding doing this when the only child is the result of
another set operation.  In that case we'll fall back on the
assume-all-rows-are-unique method.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/cfbc99e5-9d44-4806-ba3c-f36b57a85e21@gmail.com
---
 src/backend/optimizer/prep/prepunion.c | 23 +++++++++++++----------
 src/test/regress/expected/union.out    | 22 ++++++++++++++++++++++
 src/test/regress/sql/union.sql         |  8 ++++++++
 3 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 72539545656..f528f096a56 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -901,19 +901,22 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		double		dNumGroups;
 		bool		can_sort = grouping_is_sortable(groupList);
 		bool		can_hash = grouping_is_hashable(groupList);
+		Path	   *first_path = linitial(cheapest_pathlist);
 
-		if (list_length(cheapest_pathlist) == 1)
+		/*
+		 * Estimate the number of UNION output rows.  In the case when only a
+		 * single UNION child remains, we can use estimate_num_groups() on
+		 * that child.  We must be careful not to do this when that child is
+		 * the result of some other set operation as the targetlist will
+		 * contain Vars with varno==0, which estimate_num_groups() wouldn't
+		 * like.
+		 */
+		if (list_length(cheapest_pathlist) == 1 &&
+			first_path->parent->reloptkind != RELOPT_UPPER_REL)
 		{
-			Path	   *path = linitial(cheapest_pathlist);
-
-			/*
-			 * In the case where only one union child remains due to the
-			 * detection of one or more dummy union children, obtain an
-			 * estimate on the surviving child directly.
-			 */
 			dNumGroups = estimate_num_groups(root,
-											 path->pathtarget->exprs,
-											 path->rows,
+											 first_path->pathtarget->exprs,
+											 first_path->rows,
 											 NULL,
 											 NULL);
 		}
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 4533967e84a..709c85f2294 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1355,6 +1355,28 @@ ORDER BY 1;
          Output: tenk1.two
 (5 rows)
 
+-- Try a mixed setop case.  Ensure the right-hand UNION child gets removed.
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 t1
+EXCEPT
+SELECT four FROM tenk1 t2
+UNION
+SELECT ten FROM tenk1 dummy WHERE 1=2;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Unique
+   Output: t1.two
+   ->  Sort
+         Output: t1.two
+         Sort Key: t1.two
+         ->  HashSetOp Except
+               Output: t1.two
+               ->  Seq Scan on public.tenk1 t1
+                     Output: t1.two
+               ->  Seq Scan on public.tenk1 t2
+                     Output: t2.four
+(11 rows)
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 782cca23701..d0c70fafbea 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -523,6 +523,14 @@ EXCEPT ALL
 SELECT four FROM tenk1 WHERE 1=2
 ORDER BY 1;
 
+-- Try a mixed setop case.  Ensure the right-hand UNION child gets removed.
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT two FROM tenk1 t1
+EXCEPT
+SELECT four FROM tenk1 t2
+UNION
+SELECT ten FROM tenk1 dummy WHERE 1=2;
+
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
-- 
2.43.0

#23David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#22)
Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

On Wed, 5 Nov 2025 at 22:44, David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 5 Nov 2025 at 18:26, David Rowley <dgrowleyml@gmail.com> wrote:

I'm considering restricting that code Path to where the
path->parent->reloptkind != RELOPT_UPPER_REL.

I couldn't think of anything better, so here's a patch to that effect.

I've pushed this.

David