Avoid computing ORDER BY junk columns unnecessarily

Started by Heikki Linnakangasabout 2 years ago12 messages

Heikki Linnakangas

hlinnaka@iki.fi

about 2 years ago

5 attachment(s)

Problem
-------

We are using junk columns for (at least) two slightly different purposes:

1. For passing row IDs and other such data from lower plan nodes to
LockRows / ModifyTable.

2. To represent ORDER BY and GROUP BY columns that don't appear in the
SELECT list. For example, in a query like:

SELECT foo FROM mytable ORDER BY bar;

The parser adds 'bar' to the target list as a junk column. You can see
that with EXPLAIN VERBOSE:

explain (verbose, costs off)
select foo from mytable order by bar;

QUERY PLAN
----------------------------------
Sort
Output: foo, bar
Sort Key: mytable.bar
-> Seq Scan on public.mytable
Output: foo, bar
(5 rows)

The 'bar' column get filtered away in the executor, by the so-called
junk filter. That's fine for simple cases like the above, but in some
cases, that causes the ORDER BY value to be computed unnecessarily. For
example:

create table mytable (foo text, bar text);
insert into mytable select g, g from generate_series(1, 10000) g;
create index on mytable (sha256(bar::bytea));
explain verbose
select foo from mytable order by sha256(bar::bytea);

QUERY PLAN

------------------------------------------------------------------------------------------------
Index Scan using mytable_sha256_idx on public.mytable
(cost=0.29..737.28 rows=10000 width=64)
Output: foo, sha256((bar)::bytea)
(2 rows)

The index is used to satisfy the ORDER BY, but the expensive ORDER BY
expression is still computed for every row, just to be thrown away by
the junk filter.

This came up with pgvector, as the vector distance functions are pretty
expensive. All vector operations are expensive, so one extra distance
function call per row doesn't necessarily make that much difference, but
it sure looks silly. See
https://github.com/pgvector/pgvector/issues/359#issuecomment-1840786021
(thanks Matthias for the investigation!).

Solution
--------

The obvious solution is that the planner should not include those junk
columns in the plan. But how exactly to implement that is a different
matter.

I came up with the attached patch set, which adds a projection to all
the paths at the end of planning in grouping_planner(). The projection
filters out the unnecessary junk columns. With that, the plan for the
above example:

postgres=# explain verbose select foo from mytable order by
sha256(bar::bytea);
QUERY PLAN

-----------------------------------------------------------------------------------------------
Index Scan using mytable_sha256_idx on public.mytable
(cost=0.29..662.24 rows=10000 width=4)
Output: foo
(2 rows)

Problems with the solution
--------------------------

So this seems to work, but I have a few doubts:

1. Because Sort cannot project, this adds an extra Result node on top of
Sort nodes when the the ORDER BY is implemented by sorting:

postgres=# explain verbose select foo from mytable order by bar;
QUERY PLAN

--------------------------------------------------------------------------------
Result (cost=818.39..843.39 rows=10000 width=4)
Output: foo
-> Sort (cost=818.39..843.39 rows=10000 width=8)
Output: foo, bar
Sort Key: mytable.bar
-> Seq Scan on public.mytable (cost=0.00..154.00 rows=10000
width=8)
Output: foo, bar
(7 rows)

From a performance point of view, I think that's not as bad as it
sounds. Remember that without this patch, the executor needs to execute
the junk filter to filter out the extra column instead. It's not clear
that an extra Result is worse than that, although I haven't tried
benchmarking it though.

This makes plans for queries like above more verbose though. Perhaps we
should teach Sort (and MergeAppend) to do projection, just to avoid that?

Another solution would be to continue relying on the junk filter, if
adding the projection in the planner leads to an extra Result node.
That's a bit ugly, because then the final target list of a (sub)query
depends on the Path that's chosen.

2. Instead of tacking on the projection to the paths at the end, I first
tried modifying the code earlier in grouping_planner() that computes the
target lists for the different plan stages. That still feels like a
cleaner approach to me, although I don't think there's any difference in
the generated plans in practice. However I ran into some problems with
that approach and gave up.

I basically tried to remove the junk columns from 'final_target', and
have create_ordered_paths() create paths with the filtered target list
directly. And if there is no ORDER BY, leave out the junk columns from
'grouping_target' too, and have create_grouping_paths() generate the
final target list directly. However, create_grouping_paths() assumes
that the grouping columns are included in 'grouping_target'. And
similarly in create_ordered_paths(), some partial planning stages assume
that the ordering columns are included in 'final_target'. Those
assumptions could probably be fixed, but I ran out of steam trying to do
that.

3. I also considered if we should stop using junk columns to represent
ORDER BY / GROUP BY columns like this in the first place. Perhaps we
should have a separate list for those and not stash them in the target
list. But that seems like a much bigger change.

4. It would be nice to get rid of the junk filter in the executor
altogether. With this patch, the junk columns used for RowLocks and
ModifyTable are still included in the final target list, and are still
filtered out by the junk filter. But we could add similar projections
between the RowLocks and ModifyTable stages, to eliminate all the junk
columns at the top of the plan. ExecFilterJunk() isn't a lot of code,
but it would feel cleaner to me. I didn't try to implement that.

5. I'm not sure the categorization of junk columns that I implemented
here is the best one. It might make sense to have more categories, and
distinguish row-id columns from others for example. And ORDER BY columns
from GROUP BY columns.

Patches
-------

So the attached patches implement that idea, with the above-mentioned
problems. I think this approach is fine as it is, despite those
problems, but my planner-fu is a rusty so I really need review and a
second opinion on this.

v1-0001-Add-test-for-Min-Max-optimization-with-kNN-index-.patch
v1-0002-Show-how-ORDER-BY-expression-is-computed-unnecess.patch

These patches just add to existing tests to demonstrate the problem.

v1-0003-Turn-resjunk-into-an-enum.patch

Replace 'resjunk' boolean with an enum, so that we can distinguish
between different junk columns. The next patch uses that information to
identify junk columns that can be filtered out. It's is a separate patch
for easier review.

v1-0004-Omit-columns-from-final-tlist-that-were-only-need.patch

The main patch in this series.

v1-0005-Fix-regression-tests-caused-by-additional-Result-.patch

Regression test output changes, for all the plans with Sort that now
have Sort + Result. See "Problem with the solution" #1.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v1-0001-Add-test-for-Min-Max-optimization-with-kNN-index-.patchtext/x-patch; charset=UTF-8; name=v1-0001-Add-test-for-Min-Max-optimization-with-kNN-index-.patchDownload

From ce31605e75a4a47add73ce5f20f875c1689bc4af Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu, 21 Dec 2023 18:22:55 +0200
Subject: [PATCH v1 1/5] Add test for Min/Max optimization with kNN index scan.

We didn't coverage for that combination. It works the same as other
cases, but I think it's an interesting combination to show off.
---
 src/test/regress/expected/aggregates.out | 22 ++++++++++++++++++++++
 src/test/regress/sql/aggregates.sql      |  8 ++++++++
 2 files changed, 30 insertions(+)

diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index d8271da4d1f..e7814b7527b 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1063,6 +1063,28 @@ select f1, (select min(unique1) from tenk1 where unique1 > f1) AS gt
  -2147483647 |  0
 (5 rows)
 
+-- check kNN search
+begin;
+set local enable_seqscan = off;
+explain (costs off)
+  SELECT min(f1 <-> '(0,0)'::point) FROM polygon_tbl;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Result
+   InitPlan 1 (returns $0)
+     ->  Limit
+           ->  Index Scan using gpolygonind on polygon_tbl
+                 Order By: (f1 <-> '(0,0)'::point)
+                 Filter: ((f1 <-> '(0,0)'::point) IS NOT NULL)
+(6 rows)
+
+SELECT min(f1 <-> '(0,0)'::point) FROM polygon_tbl;
+ min 
+-----
+   0
+(1 row)
+
+rollback;
 -- check some cases that were handled incorrectly in 8.3.0
 explain (costs off)
   select distinct max(unique2) from tenk1;
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 75c78be640b..50b55c91fc4 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -388,6 +388,14 @@ explain (costs off)
 select f1, (select min(unique1) from tenk1 where unique1 > f1) AS gt
   from int4_tbl;
 
+-- check kNN search
+begin;
+set local enable_seqscan = off;
+explain (costs off)
+  SELECT min(f1 <-> '(0,0)'::point) FROM polygon_tbl;
+SELECT min(f1 <-> '(0,0)'::point) FROM polygon_tbl;
+rollback;
+
 -- check some cases that were handled incorrectly in 8.3.0
 explain (costs off)
   select distinct max(unique2) from tenk1;
-- 
2.39.2

v1-0002-Show-how-ORDER-BY-expression-is-computed-unnecess.patchtext/x-patch; charset=UTF-8; name=v1-0002-Show-how-ORDER-BY-expression-is-computed-unnecess.patchDownload

From 33e92bfd5dd19d731a49312143c5d340530f5ee2 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu, 21 Dec 2023 17:46:28 +0200
Subject: [PATCH v1 2/5] Show how ORDER BY expression is computed unnecessarily
 in test

In the query in 'create_index', it's unnecessarily to compute the
distance, because it is not part of the original SELECT column
list. It is added to the target list as a junk column, because the
planner uses target list entries to represent ORDER BY columns, but
it's not needed in the executor. The junk filter in the executor will
remove it from the final result. For simple columns it hardly matters,
but if the distince function is expensive it might.

The next commits will try to address that; this commit just
demonstrates the problem.
---
 src/test/regress/expected/create_index.out | 13 +++++++------
 src/test/regress/expected/gist.out         | 15 ++++++++-------
 src/test/regress/sql/create_index.sql      |  2 +-
 src/test/regress/sql/gist.sql              |  2 +-
 4 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 446cfa678b7..1e2222929f1 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -459,13 +459,14 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 ~= '(-5, -12)';
      1
 (1 row)
 
-EXPLAIN (COSTS OFF)
+EXPLAIN (VERBOSE, COSTS OFF)
 SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
-                  QUERY PLAN                  
-----------------------------------------------
- Index Only Scan using gpointind on point_tbl
-   Order By: (f1 <-> '(0,1)'::point)
-(2 rows)
+                      QUERY PLAN                      
+------------------------------------------------------
+ Index Only Scan using gpointind on pg_temp.point_tbl
+   Output: f1, (f1 <-> '(0,1)'::point)
+   Order By: (point_tbl.f1 <-> '(0,1)'::point)
+(3 rows)
 
 SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
         f1         
diff --git a/src/test/regress/expected/gist.out b/src/test/regress/expected/gist.out
index c75bbb23b6e..018b76f21d1 100644
--- a/src/test/regress/expected/gist.out
+++ b/src/test/regress/expected/gist.out
@@ -75,15 +75,16 @@ select p from gist_tbl where p <@ box(point(0,0), point(0.5, 0.5));
 (11 rows)
 
 -- Also test an index-only knn-search
-explain (costs off)
+explain (verbose, costs off)
 select p from gist_tbl where p <@ box(point(0,0), point(0.5, 0.5))
 order by p <-> point(0.201, 0.201);
-                       QUERY PLAN                       
---------------------------------------------------------
- Index Only Scan using gist_tbl_point_index on gist_tbl
-   Index Cond: (p <@ '(0.5,0.5),(0,0)'::box)
-   Order By: (p <-> '(0.201,0.201)'::point)
-(3 rows)
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Index Only Scan using gist_tbl_point_index on public.gist_tbl
+   Output: p, (p <-> '(0.201,0.201)'::point)
+   Index Cond: (gist_tbl.p <@ '(0.5,0.5),(0,0)'::box)
+   Order By: (gist_tbl.p <-> '(0.201,0.201)'::point)
+(4 rows)
 
 select p from gist_tbl where p <@ box(point(0,0), point(0.5, 0.5))
 order by p <-> point(0.201, 0.201);
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index d49ce9f3007..22c08c753a7 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -215,7 +215,7 @@ EXPLAIN (COSTS OFF)
 SELECT count(*) FROM point_tbl p WHERE p.f1 ~= '(-5, -12)';
 SELECT count(*) FROM point_tbl p WHERE p.f1 ~= '(-5, -12)';
 
-EXPLAIN (COSTS OFF)
+EXPLAIN (VERBOSE, COSTS OFF)
 SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
 SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
 
diff --git a/src/test/regress/sql/gist.sql b/src/test/regress/sql/gist.sql
index 6f1fc65f128..86c111ec500 100644
--- a/src/test/regress/sql/gist.sql
+++ b/src/test/regress/sql/gist.sql
@@ -66,7 +66,7 @@ select p from gist_tbl where p <@ box(point(0,0), point(0.5, 0.5));
 select p from gist_tbl where p <@ box(point(0,0), point(0.5, 0.5));
 
 -- Also test an index-only knn-search
-explain (costs off)
+explain (verbose, costs off)
 select p from gist_tbl where p <@ box(point(0,0), point(0.5, 0.5))
 order by p <-> point(0.201, 0.201);
 
-- 
2.39.2

v1-0003-Turn-resjunk-into-an-enum.patchtext/x-patch; charset=UTF-8; name=v1-0003-Turn-resjunk-into-an-enum.patchDownload

From c517790d2ccf6668df9a737008da9e6b1c87c003 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu, 21 Dec 2023 17:02:03 +0200
Subject: [PATCH v1 3/5] Turn 'resjunk' into an enum

This doesn't have any user-visible effect. It's a separate commit for
easier review.
---
 src/backend/executor/functions.c          |  6 ++---
 src/backend/executor/nodeSubplan.c        |  4 ++--
 src/backend/nodes/makefuncs.c             |  2 +-
 src/backend/optimizer/plan/createplan.c   |  8 +++----
 src/backend/optimizer/plan/planagg.c      |  2 +-
 src/backend/optimizer/plan/setrefs.c      |  3 ++-
 src/backend/optimizer/plan/subselect.c    |  2 +-
 src/backend/optimizer/prep/prepjointree.c |  4 ++--
 src/backend/optimizer/prep/preptlist.c    | 14 ++++++------
 src/backend/optimizer/prep/prepunion.c    |  8 +++----
 src/backend/optimizer/util/appendinfo.c   |  4 ++--
 src/backend/optimizer/util/inherit.c      |  6 ++---
 src/backend/optimizer/util/plancat.c      |  6 ++---
 src/backend/optimizer/util/tlist.c        |  4 ++--
 src/backend/parser/analyze.c              | 15 ++++++------
 src/backend/parser/parse_agg.c            |  4 ++--
 src/backend/parser/parse_clause.c         |  4 ++--
 src/backend/parser/parse_expr.c           |  4 ++--
 src/backend/parser/parse_merge.c          |  2 +-
 src/backend/parser/parse_relation.c       |  2 +-
 src/backend/parser/parse_target.c         |  6 ++---
 src/backend/rewrite/rewriteHandler.c      |  4 ++--
 src/backend/rewrite/rewriteSearchCycle.c  | 28 +++++++++++------------
 src/include/nodes/makefuncs.h             |  2 +-
 src/include/nodes/primnodes.h             | 23 +++++++++++++++++--
 src/include/parser/parse_target.h         |  2 +-
 26 files changed, 95 insertions(+), 74 deletions(-)

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index bace25234c3..fff41cb9dea 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1853,7 +1853,7 @@ check_sql_fn_retval(List *queryTreeLists,
 										  makeTargetEntry(null_expr,
 														  list_length(upper_tlist) + 1,
 														  NULL,
-														  false));
+														  NOT_JUNK));
 					upper_tlist_nontrivial = true;
 				}
 			} while (attr->attisdropped);
@@ -1899,7 +1899,7 @@ check_sql_fn_retval(List *queryTreeLists,
 									  makeTargetEntry(null_expr,
 													  list_length(upper_tlist) + 1,
 													  NULL,
-													  false));
+													  NOT_JUNK));
 				upper_tlist_nontrivial = true;
 			}
 		}
@@ -2041,7 +2041,7 @@ coerce_fn_result_column(TargetEntry *src_tle,
 	}
 	new_tle = makeTargetEntry(new_tle_expr,
 							  list_length(*upper_tlist) + 1,
-							  src_tle->resname, false);
+							  src_tle->resname, NOT_JUNK);
 	*upper_tlist = lappend(*upper_tlist, new_tle);
 	return true;
 }
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index c136f75ac24..cbf6d2f8e07 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -982,7 +982,7 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
 			tle = makeTargetEntry(expr,
 								  i,
 								  NULL,
-								  false);
+								  NOT_JUNK);
 			lefttlist = lappend(lefttlist, tle);
 
 			/* Process righthand argument */
@@ -990,7 +990,7 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)
 			tle = makeTargetEntry(expr,
 								  i,
 								  NULL,
-								  false);
+								  NOT_JUNK);
 			righttlist = lappend(righttlist, tle);
 
 			/* Lookup the equality function (potentially cross-type) */
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index 89e77adbc75..59485f00043 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -241,7 +241,7 @@ TargetEntry *
 makeTargetEntry(Expr *expr,
 				AttrNumber resno,
 				char *resname,
-				bool resjunk)
+				JunkKind resjunk)
 {
 	TargetEntry *tle = makeNode(TargetEntry);
 
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 34ca6d4ac21..5842d31bdcd 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -844,7 +844,7 @@ build_path_tlist(PlannerInfo *root, Path *path)
 		tle = makeTargetEntry((Expr *) node,
 							  resno,
 							  NULL,
-							  false);
+							  NOT_JUNK);
 		if (sortgrouprefs)
 			tle->ressortgroupref = sortgrouprefs[resno - 1];
 
@@ -1772,7 +1772,7 @@ create_unique_plan(PlannerInfo *root, UniquePath *best_path, int flags)
 			tle = makeTargetEntry((Expr *) uniqexpr,
 								  nextresno,
 								  NULL,
-								  false);
+								  NOT_JUNK);
 			newtlist = lappend(newtlist, tle);
 			nextresno++;
 			newitems = true;
@@ -3156,7 +3156,7 @@ create_indexscan_plan(PlannerInfo *root,
 		{
 			TargetEntry *indextle = (TargetEntry *) lfirst(l);
 
-			indextle->resjunk = !indexinfo->canreturn[i];
+			indextle->resjunk = indexinfo->canreturn[i] ? NOT_JUNK: JUNK_PLANNER_ONLY;
 			i++;
 		}
 	}
@@ -6270,7 +6270,7 @@ prepare_sort_from_pathkeys(Plan *lefttree, List *pathkeys,
 			tle = makeTargetEntry(copyObject(em->em_expr),
 								  list_length(tlist) + 1,
 								  NULL,
-								  true);
+								  JUNK_SORT_GROUP_COL);
 			tlist = lappend(tlist, tle);
 			lefttree->targetlist = tlist;	/* just in case NIL before */
 		}
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
index a6090167f5a..624afae6bce 100644
--- a/src/backend/optimizer/plan/planagg.c
+++ b/src/backend/optimizer/plan/planagg.c
@@ -371,7 +371,7 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
 	tle = makeTargetEntry(copyObject(mminfo->target),
 						  (AttrNumber) 1,
 						  pstrdup("agg_target"),
-						  false);
+						  NOT_JUNK);
 	tlist = list_make1(tle);
 	subroot->processed_tlist = parse->targetList = tlist;
 
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 4bb68ac90e7..8018a0c536a 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1474,6 +1474,7 @@ trivial_subqueryscan(SubqueryScan *plan)
 		TargetEntry *ptle = (TargetEntry *) lfirst(lp);
 		TargetEntry *ctle = (TargetEntry *) lfirst(lc);
 
+		/* XXX: do the kind has to match exactly? */
 		if (ptle->resjunk != ctle->resjunk)
 			return false;		/* tlist doesn't match junk status */
 
@@ -2568,7 +2569,7 @@ convert_combining_aggrefs(Node *node, void *context)
 		 * And set up parent_agg to represent the second phase.
 		 */
 		parent_agg->args = list_make1(makeTargetEntry((Expr *) child_agg,
-													  1, NULL, false));
+													  1, NULL, NOT_JUNK));
 		mark_partial_aggref(parent_agg, AGGSPLIT_FINAL_DESERIAL);
 
 		return (Node *) parent_agg;
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 7a9fe88fec3..6ec8fe8abb5 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1826,7 +1826,7 @@ convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
 						makeTargetEntry((Expr *) rightarg,
 										resno++,
 										NULL,
-										false));
+										NOT_JUNK));
 		testlist = lappend(testlist,
 						   make_opclause(opid, BOOLOID, false,
 										 (Expr *) leftarg, (Expr *) param,
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 73ff40721c9..8ba6b59a6e6 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -1658,7 +1658,7 @@ pull_up_simple_values(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte)
 						makeTargetEntry((Expr *) lfirst(lc),
 										attrno,
 										NULL,
-										false));
+										NOT_JUNK));
 		attrno++;
 	}
 	rvcontext.root = root;
@@ -1816,7 +1816,7 @@ pull_up_constant_function(PlannerInfo *root, Node *jtnode,
 	rvcontext.targetlist = list_make1(makeTargetEntry((Expr *) rtf->funcexpr,
 													  1,	/* resno */
 													  NULL, /* resname */
-													  false));	/* resjunk */
+													  NOT_JUNK));	/* resjunk */
 	rvcontext.target_rte = rte;
 
 	/*
diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c
index 9d46488ef7c..a15b9aca0af 100644
--- a/src/backend/optimizer/prep/preptlist.c
+++ b/src/backend/optimizer/prep/preptlist.c
@@ -177,7 +177,7 @@ preprocess_targetlist(PlannerInfo *root)
 
 				tle = makeTargetEntry((Expr *) var,
 									  list_length(tlist) + 1,
-									  NULL, true);
+									  NULL, JUNK_OTHER);
 				tlist = lappend(tlist, tle);
 			}
 			list_free(vars);
@@ -223,7 +223,7 @@ preprocess_targetlist(PlannerInfo *root)
 			tle = makeTargetEntry((Expr *) var,
 								  list_length(tlist) + 1,
 								  pstrdup(resname),
-								  true);
+								  JUNK_OTHER);
 			tlist = lappend(tlist, tle);
 		}
 		if (rc->allMarkTypes & (1 << ROW_MARK_COPY))
@@ -232,12 +232,12 @@ preprocess_targetlist(PlannerInfo *root)
 			var = makeWholeRowVar(rt_fetch(rc->rti, range_table),
 								  rc->rti,
 								  0,
-								  false);
+								  NOT_JUNK);
 			snprintf(resname, sizeof(resname), "wholerow%u", rc->rowmarkId);
 			tle = makeTargetEntry((Expr *) var,
 								  list_length(tlist) + 1,
 								  pstrdup(resname),
-								  true);
+								  JUNK_OTHER);
 			tlist = lappend(tlist, tle);
 		}
 
@@ -254,7 +254,7 @@ preprocess_targetlist(PlannerInfo *root)
 			tle = makeTargetEntry((Expr *) var,
 								  list_length(tlist) + 1,
 								  pstrdup(resname),
-								  true);
+								  JUNK_OTHER);
 			tlist = lappend(tlist, tle);
 		}
 	}
@@ -290,7 +290,7 @@ preprocess_targetlist(PlannerInfo *root)
 			tle = makeTargetEntry((Expr *) var,
 								  list_length(tlist) + 1,
 								  NULL,
-								  true);
+								  JUNK_OTHER);
 
 			tlist = lappend(tlist, tle);
 		}
@@ -442,7 +442,7 @@ expand_insert_targetlist(List *tlist, Relation rel)
 			new_tle = makeTargetEntry((Expr *) new_expr,
 									  attrno,
 									  pstrdup(NameStr(att_tup->attname)),
-									  false);
+									  NOT_JUNK);
 		}
 
 		new_tlist = lappend(new_tlist, new_tle);
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 8eaa734916d..7567e7eb9f0 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1227,7 +1227,7 @@ generate_setop_tlist(List *colTypes, List *colCollations,
 		tle = makeTargetEntry((Expr *) expr,
 							  (AttrNumber) resno++,
 							  pstrdup(reftle->resname),
-							  false);
+							  NOT_JUNK);
 
 		/*
 		 * By convention, all non-resjunk columns in a setop tree have
@@ -1253,7 +1253,7 @@ generate_setop_tlist(List *colTypes, List *colCollations,
 		tle = makeTargetEntry((Expr *) expr,
 							  (AttrNumber) resno++,
 							  pstrdup("flag"),
-							  true);
+							  JUNK_OTHER);
 		tlist = lappend(tlist, tle);
 		*trivial_tlist = false; /* the extra entry makes it not trivial */
 	}
@@ -1361,7 +1361,7 @@ generate_append_tlist(List *colTypes, List *colCollations,
 		tle = makeTargetEntry((Expr *) expr,
 							  (AttrNumber) resno++,
 							  pstrdup(reftle->resname),
-							  false);
+							  NOT_JUNK);
 
 		/*
 		 * By convention, all non-resjunk columns in a setop tree have
@@ -1386,7 +1386,7 @@ generate_append_tlist(List *colTypes, List *colCollations,
 		tle = makeTargetEntry((Expr *) expr,
 							  (AttrNumber) resno++,
 							  pstrdup("flag"),
-							  true);
+							  JUNK_OTHER);
 		tlist = lappend(tlist, tle);
 	}
 
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index f456b3b0a44..4c33ebc7f51 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -810,7 +810,7 @@ add_row_identity_var(PlannerInfo *root, Var *orig_var,
 		tle = makeTargetEntry((Expr *) orig_var,
 							  list_length(root->processed_tlist) + 1,
 							  pstrdup(rowid_name),
-							  true);
+							  JUNK_OTHER);
 		root->processed_tlist = lappend(root->processed_tlist, tle);
 		return;
 	}
@@ -869,7 +869,7 @@ add_row_identity_var(PlannerInfo *root, Var *orig_var,
 	tle = makeTargetEntry((Expr *) rowid_var,
 						  list_length(root->processed_tlist) + 1,
 						  pstrdup(rowid_name),
-						  true);
+						  JUNK_OTHER);
 	root->processed_tlist = lappend(root->processed_tlist, tle);
 }
 
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index f9d3ff1e7ac..8b8d7470711 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -260,7 +260,7 @@ expand_inherited_rtentry(PlannerInfo *root, RelOptInfo *rel,
 			tle = makeTargetEntry((Expr *) var,
 								  list_length(root->processed_tlist) + 1,
 								  pstrdup(resname),
-								  true);
+								  JUNK_OTHER);
 			root->processed_tlist = lappend(root->processed_tlist, tle);
 			newvars = lappend(newvars, var);
 		}
@@ -277,7 +277,7 @@ expand_inherited_rtentry(PlannerInfo *root, RelOptInfo *rel,
 			tle = makeTargetEntry((Expr *) var,
 								  list_length(root->processed_tlist) + 1,
 								  pstrdup(resname),
-								  true);
+								  JUNK_OTHER);
 			root->processed_tlist = lappend(root->processed_tlist, tle);
 			newvars = lappend(newvars, var);
 		}
@@ -295,7 +295,7 @@ expand_inherited_rtentry(PlannerInfo *root, RelOptInfo *rel,
 			tle = makeTargetEntry((Expr *) var,
 								  list_length(root->processed_tlist) + 1,
 								  pstrdup(resname),
-								  true);
+								  JUNK_OTHER);
 			root->processed_tlist = lappend(root->processed_tlist, tle);
 			newvars = lappend(newvars, var);
 		}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 0e35b9d0ab9..219040b52f0 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1759,7 +1759,7 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
 								makeTargetEntry((Expr *) var,
 												attrno,
 												NULL,
-												false));
+												NOT_JUNK));
 			}
 
 			table_close(relation, NoLock);
@@ -1812,7 +1812,7 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
 								makeTargetEntry((Expr *) var,
 												var->varattno,
 												NULL,
-												false));
+												NOT_JUNK));
 			}
 			break;
 
@@ -1881,7 +1881,7 @@ build_index_tlist(PlannerInfo *root, IndexOptInfo *index,
 						makeTargetEntry(indexvar,
 										i + 1,
 										NULL,
-										false));
+										NOT_JUNK));
 	}
 	if (indexpr_item != NULL)
 		elog(ERROR, "wrong number of index expressions");
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
index c672b338c0d..a2a343e960e 100644
--- a/src/backend/optimizer/util/tlist.c
+++ b/src/backend/optimizer/util/tlist.c
@@ -145,7 +145,7 @@ add_to_flat_tlist(List *tlist, List *exprs)
 			tle = makeTargetEntry(copyObject(expr), /* copy needed?? */
 								  next_resno++,
 								  NULL,
-								  false);
+								  NOT_JUNK);
 			tlist = lappend(tlist, tle);
 		}
 	}
@@ -636,7 +636,7 @@ make_tlist_from_pathtarget(PathTarget *target)
 		tle = makeTargetEntry(expr,
 							  i + 1,
 							  NULL,
-							  false);
+							  NOT_JUNK);
 		if (target->sortgrouprefs)
 			tle->ressortgroupref = target->sortgrouprefs[i];
 		tlist = lappend(tlist, tle);
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 7a1dfb63645..9e032d917f7 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -953,7 +953,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
 		tle = makeTargetEntry(expr,
 							  attr_num,
 							  col->name,
-							  false);
+							  NOT_JUNK);
 		qry->targetList = lappend(qry->targetList, tle);
 
 		perminfo->insertedCols = bms_add_member(perminfo->insertedCols,
@@ -1250,7 +1250,7 @@ BuildOnConflictExcludedTargetlist(Relation targetrel,
 		te = makeTargetEntry((Expr *) var,
 							 attno + 1,
 							 name,
-							 false);
+							 NOT_JUNK);
 
 		result = lappend(result, te);
 	}
@@ -1265,7 +1265,8 @@ BuildOnConflictExcludedTargetlist(Relation targetrel,
 	var = makeVar(exclRelIndex, InvalidAttrNumber,
 				  targetrel->rd_rel->reltype,
 				  -1, InvalidOid, 0);
-	te = makeTargetEntry((Expr *) var, InvalidAttrNumber, NULL, true);
+	// XXX: right junk kind?
+	te = makeTargetEntry((Expr *) var, InvalidAttrNumber, NULL, JUNK_OTHER);
 	result = lappend(result, te);
 
 	return result;
@@ -1832,7 +1833,7 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
 		tle = makeTargetEntry((Expr *) var,
 							  (AttrNumber) pstate->p_next_resno++,
 							  colName,
-							  false);
+							  NOT_JUNK);
 		qry->targetList = lappend(qry->targetList, tle);
 		targetvars = lappend(targetvars, var);
 		targetnames = lappend(targetnames, makeString(colName));
@@ -2306,7 +2307,7 @@ transformSetOperationTree(ParseState *pstate, SelectStmt *stmt,
 				restle = makeTargetEntry((Expr *) rescolnode,
 										 0, /* no need to set resno */
 										 NULL,
-										 false);
+										 NOT_JUNK);
 				*targetlist = lappend(*targetlist, restle);
 			}
 		}
@@ -2360,7 +2361,7 @@ determineRecursiveColTypes(ParseState *pstate, Node *larg, List *nrtargetlist)
 		tle = makeTargetEntry(nrtle->expr,
 							  next_resno++,
 							  colName,
-							  false);
+							  NOT_JUNK);
 		targetList = lappend(targetList, tle);
 	}
 
@@ -2382,7 +2383,7 @@ transformReturnStmt(ParseState *pstate, ReturnStmt *stmt)
 	qry->isReturn = true;
 
 	qry->targetList = list_make1(makeTargetEntry((Expr *) transformExpr(pstate, stmt->returnval, EXPR_KIND_SELECT_TARGET),
-												 1, NULL, false));
+												 1, NULL, NOT_JUNK));
 
 	if (pstate->p_resolve_unknowns)
 		resolveTargetListUnknowns(pstate, qry->targetList);
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 9bbad33fbdb..92559a5cd72 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -139,7 +139,7 @@ transformAggregateCall(ParseState *pstate, Aggref *agg,
 			TargetEntry *tle;
 
 			/* We don't bother to assign column names to the entries */
-			tle = makeTargetEntry(arg, attno++, NULL, false);
+			tle = makeTargetEntry(arg, attno++, NULL, NOT_JUNK);
 			tlist = lappend(tlist, tle);
 
 			torder = addTargetToSortList(pstate, tle,
@@ -163,7 +163,7 @@ transformAggregateCall(ParseState *pstate, Aggref *agg,
 			TargetEntry *tle;
 
 			/* We don't bother to assign column names to the entries */
-			tle = makeTargetEntry(arg, attno++, NULL, false);
+			tle = makeTargetEntry(arg, attno++, NULL, NOT_JUNK);
 			tlist = lappend(tlist, tle);
 		}
 
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 334b9b42bd5..9f33ce97cd0 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -2201,11 +2201,11 @@ findTargetlistEntrySQL99(ParseState *pstate, Node *node, List **tlist,
 
 	/*
 	 * If no matches, construct a new target entry which is appended to the
-	 * end of the target list.  This target is given resjunk = true so that it
+	 * end of the target list.  This target is given resjunk = true XXX so that it
 	 * will not be projected into the final tuple.
 	 */
 	target_result = transformTargetEntry(pstate, node, expr, exprKind,
-										 NULL, true);
+										 NULL, JUNK_SORT_GROUP_COL);
 
 	*tlist = lappend(*tlist, target_result);
 
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 64c582c344c..b04586345f7 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1480,7 +1480,7 @@ transformMultiAssignRef(ParseState *pstate, MultiAssignRef *maref)
 			 * about selecting a resno for it; transformUpdateStmt will do
 			 * that.
 			 */
-			tle = makeTargetEntry((Expr *) sublink, 0, NULL, true);
+			tle = makeTargetEntry((Expr *) sublink, 0, NULL, JUNK_OTHER);
 			pstate->p_multiassign_exprs = lappend(pstate->p_multiassign_exprs,
 												  tle);
 
@@ -1509,7 +1509,7 @@ transformMultiAssignRef(ParseState *pstate, MultiAssignRef *maref)
 			 * Temporarily append it to p_multiassign_exprs, so we can get it
 			 * back when we come back here for additional columns.
 			 */
-			tle = makeTargetEntry((Expr *) rexpr, 0, NULL, true);
+			tle = makeTargetEntry((Expr *) rexpr, 0, NULL, JUNK_OTHER);
 			pstate->p_multiassign_exprs = lappend(pstate->p_multiassign_exprs,
 												  tle);
 		}
diff --git a/src/backend/parser/parse_merge.c b/src/backend/parser/parse_merge.c
index 91b1156d991..1c7456c6f7f 100644
--- a/src/backend/parser/parse_merge.c
+++ b/src/backend/parser/parse_merge.c
@@ -358,7 +358,7 @@ transformMergeStmt(ParseState *pstate, MergeStmt *stmt)
 						tle = makeTargetEntry(expr,
 											  attr_num,
 											  col->name,
-											  false);
+											  NOT_JUNK);
 						action->targetList = lappend(action->targetList, tle);
 
 						perminfo->insertedCols =
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 864ea9b0d5d..77885e2b4e5 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -3221,7 +3221,7 @@ expandNSItemAttrs(ParseState *pstate, ParseNamespaceItem *nsitem,
 		te = makeTargetEntry((Expr *) varnode,
 							 (AttrNumber) pstate->p_next_resno++,
 							 label,
-							 false);
+							 NOT_JUNK);
 		te_list = lappend(te_list, te);
 
 		if (require_col_privs)
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index 3bc62ac3ba5..3d684506049 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -78,7 +78,7 @@ transformTargetEntry(ParseState *pstate,
 					 Node *expr,
 					 ParseExprKind exprKind,
 					 char *colname,
-					 bool resjunk)
+					 JunkKind resjunk)
 {
 	/* Transform the node if caller didn't do it already */
 	if (expr == NULL)
@@ -184,7 +184,7 @@ transformTargetList(ParseState *pstate, List *targetlist,
 												NULL,
 												exprKind,
 												res->name,
-												false));
+												NOT_JUNK));
 	}
 
 	/*
@@ -1480,7 +1480,7 @@ ExpandRowReference(ParseState *pstate, Node *expr,
 			te = makeTargetEntry((Expr *) fselect,
 								 (AttrNumber) pstate->p_next_resno++,
 								 pstrdup(NameStr(att->attname)),
-								 false);
+								 NOT_JUNK);
 			result = lappend(result, te);
 		}
 		else
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 41a362310a8..ee6cd97f6a3 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1020,7 +1020,7 @@ rewriteTargetListIU(List *targetList,
 				new_tle = makeTargetEntry((Expr *) new_expr,
 										  attrno,
 										  pstrdup(NameStr(att_tup->attname)),
-										  false);
+										  NOT_JUNK);
 		}
 
 		if (new_tle)
@@ -1782,7 +1782,7 @@ ApplyRetrieveRule(Query *parsetree,
 			tle = makeTargetEntry((Expr *) var,
 								  list_length(parsetree->targetList) + 1,
 								  pstrdup("wholerow"),
-								  true);
+								  JUNK_OTHER);
 
 			parsetree->targetList = lappend(parsetree->targetList, tle);
 
diff --git a/src/backend/rewrite/rewriteSearchCycle.c b/src/backend/rewrite/rewriteSearchCycle.c
index 428a98ef2bb..767c7fe5b38 100644
--- a/src/backend/rewrite/rewriteSearchCycle.c
+++ b/src/backend/rewrite/rewriteSearchCycle.c
@@ -306,7 +306,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 					  list_nth_int(cte->ctecoltypmods, i),
 					  list_nth_oid(cte->ctecolcollations, i),
 					  0);
-		tle = makeTargetEntry((Expr *) var, i + 1, strVal(list_nth(cte->ctecolnames, i)), false);
+		tle = makeTargetEntry((Expr *) var, i + 1, strVal(list_nth(cte->ctecolnames, i)), NOT_JUNK);
 		tle->resorigtbl = list_nth_node(TargetEntry, rte1->subquery->targetList, i)->resorigtbl;
 		tle->resorigcol = list_nth_node(TargetEntry, rte1->subquery->targetList, i)->resorigcol;
 		newq1->targetList = lappend(newq1->targetList, tle);
@@ -330,7 +330,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 		tle = makeTargetEntry(texpr,
 							  list_length(newq1->targetList) + 1,
 							  cte->search_clause->search_seq_column,
-							  false);
+							  NOT_JUNK);
 		newq1->targetList = lappend(newq1->targetList, tle);
 	}
 	if (cte->cycle_clause)
@@ -338,13 +338,13 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 		tle = makeTargetEntry((Expr *) cte->cycle_clause->cycle_mark_default,
 							  list_length(newq1->targetList) + 1,
 							  cte->cycle_clause->cycle_mark_column,
-							  false);
+							  NOT_JUNK);
 		newq1->targetList = lappend(newq1->targetList, tle);
 		cycle_col_rowexpr = make_path_rowexpr(cte, cte->cycle_clause->cycle_col_list);
 		tle = makeTargetEntry(make_path_initial_array(cycle_col_rowexpr),
 							  list_length(newq1->targetList) + 1,
 							  cte->cycle_clause->cycle_path_column,
-							  false);
+							  NOT_JUNK);
 		newq1->targetList = lappend(newq1->targetList, tle);
 	}
 
@@ -426,7 +426,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 		tle = makeTargetEntry((Expr *) var,
 							  list_length(newsubquery->targetList) + 1,
 							  cte->search_clause->search_seq_column,
-							  false);
+							  NOT_JUNK);
 		newsubquery->targetList = lappend(newsubquery->targetList, tle);
 	}
 	if (cte->cycle_clause)
@@ -441,7 +441,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 		tle = makeTargetEntry((Expr *) var,
 							  list_length(newsubquery->targetList) + 1,
 							  cte->cycle_clause->cycle_mark_column,
-							  false);
+							  NOT_JUNK);
 		newsubquery->targetList = lappend(newsubquery->targetList, tle);
 
 		/* ctename.cpa */
@@ -450,7 +450,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 		tle = makeTargetEntry((Expr *) var,
 							  list_length(newsubquery->targetList) + 1,
 							  cte->cycle_clause->cycle_path_column,
-							  false);
+							  NOT_JUNK);
 		newsubquery->targetList = lappend(newsubquery->targetList, tle);
 	}
 
@@ -494,7 +494,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 					  list_nth_int(cte->ctecoltypmods, i),
 					  list_nth_oid(cte->ctecolcollations, i),
 					  0);
-		tle = makeTargetEntry((Expr *) var, i + 1, strVal(list_nth(cte->ctecolnames, i)), false);
+		tle = makeTargetEntry((Expr *) var, i + 1, strVal(list_nth(cte->ctecolnames, i)), NOT_JUNK);
 		tle->resorigtbl = list_nth_node(TargetEntry, rte2->subquery->targetList, i)->resorigtbl;
 		tle->resorigcol = list_nth_node(TargetEntry, rte2->subquery->targetList, i)->resorigcol;
 		newq2->targetList = lappend(newq2->targetList, tle);
@@ -537,7 +537,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 		tle = makeTargetEntry(texpr,
 							  list_length(newq2->targetList) + 1,
 							  cte->search_clause->search_seq_column,
-							  false);
+							  NOT_JUNK);
 		newq2->targetList = lappend(newq2->targetList, tle);
 	}
 
@@ -572,7 +572,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 		tle = makeTargetEntry((Expr *) caseexpr,
 							  list_length(newq2->targetList) + 1,
 							  cte->cycle_clause->cycle_mark_column,
-							  false);
+							  NOT_JUNK);
 		newq2->targetList = lappend(newq2->targetList, tle);
 
 		/*
@@ -581,7 +581,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 		tle = makeTargetEntry(make_path_cat_expr(cycle_col_rowexpr, cpa_attno),
 							  list_length(newq2->targetList) + 1,
 							  cte->cycle_clause->cycle_path_column,
-							  false);
+							  NOT_JUNK);
 		newq2->targetList = lappend(newq2->targetList, tle);
 	}
 
@@ -636,7 +636,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 																		search_seq_type, -1, InvalidOid, 0),
 													   list_length(ctequery->targetList) + 1,
 													   cte->search_clause->search_seq_column,
-													   false));
+													   NOT_JUNK));
 	}
 	if (cte->cycle_clause)
 	{
@@ -647,13 +647,13 @@ rewriteSearchAndCycle(CommonTableExpr *cte)
 																		cte->cycle_clause->cycle_mark_collation, 0),
 													   list_length(ctequery->targetList) + 1,
 													   cte->cycle_clause->cycle_mark_column,
-													   false));
+													   NOT_JUNK));
 		ctequery->targetList = lappend(ctequery->targetList,
 									   makeTargetEntry((Expr *) makeVar(1, cpa_attno,
 																		RECORDARRAYOID, -1, InvalidOid, 0),
 													   list_length(ctequery->targetList) + 1,
 													   cte->cycle_clause->cycle_path_column,
-													   false));
+													   NOT_JUNK));
 	}
 
 	/*
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index aca0ee54dfa..56115960279 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -42,7 +42,7 @@ extern Var *makeWholeRowVar(RangeTblEntry *rte,
 extern TargetEntry *makeTargetEntry(Expr *expr,
 									AttrNumber resno,
 									char *resname,
-									bool resjunk);
+									JunkKind junk_kind);
 
 extern TargetEntry *flatCopyTargetEntry(TargetEntry *src_tle);
 
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index bb930afb521..ba7c0e1642d 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -21,6 +21,24 @@
 #include "nodes/bitmapset.h"
 #include "nodes/pg_list.h"
 
+/*
+ * These are numbered so that to check if a column is junk or not, you can
+ * use "if (resjunk)" or "if (!resjunk)".
+ */
+typedef enum JunkKind
+{
+	/* Regular column, not junk */
+	NOT_JUNK = 0,
+
+	/* An ORDER or GROUP BY column. These are left out from the final plan. */	
+	JUNK_SORT_GROUP_COL = 1,
+
+	/* Other junk column used only in planner. Also left out from final plan. */	
+	JUNK_PLANNER_ONLY,
+	
+	/* Other junk column, needs to be preserved in the final plan. */
+	JUNK_OTHER,
+} JunkKind;
 
 typedef enum OverridingKind
 {
@@ -1908,13 +1926,14 @@ typedef struct InferenceElem
  * simple reference to a column of a base table (or view).  If it is not
  * a simple reference, these fields are zeroes.
  *
- * If resjunk is true then the column is a working column (such as a sort key)
+ * If XXX resjunk is true then the column is a working column (such as a sort key)
  * that should be removed from the final output of the query.  Resjunk columns
  * must have resnos that cannot duplicate any regular column's resno.  Also
  * note that there are places that assume resjunk columns come after non-junk
  * columns.
  *--------------------
  */
+
 typedef struct TargetEntry
 {
 	Expr		xpr;
@@ -1931,7 +1950,7 @@ typedef struct TargetEntry
 	/* column's number in source table */
 	AttrNumber	resorigcol pg_node_attr(query_jumble_ignore);
 	/* set to true to eliminate the attribute from final target list */
-	bool		resjunk pg_node_attr(query_jumble_ignore);
+	JunkKind	resjunk pg_node_attr(query_jumble_ignore);
 } TargetEntry;
 
 
diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h
index d84c311bbca..92a78808a77 100644
--- a/src/include/parser/parse_target.h
+++ b/src/include/parser/parse_target.h
@@ -25,7 +25,7 @@ extern void resolveTargetListUnknowns(ParseState *pstate, List *targetlist);
 extern void markTargetListOrigins(ParseState *pstate, List *targetlist);
 extern TargetEntry *transformTargetEntry(ParseState *pstate,
 										 Node *node, Node *expr, ParseExprKind exprKind,
-										 char *colname, bool resjunk);
+										 char *colname, JunkKind resjunk);
 extern Expr *transformAssignedExpr(ParseState *pstate, Expr *expr,
 								   ParseExprKind exprKind,
 								   const char *colname,
-- 
2.39.2

v1-0004-Omit-columns-from-final-tlist-that-were-only-need.patchtext/x-patch; charset=UTF-8; name=v1-0004-Omit-columns-from-final-tlist-that-were-only-need.patchDownload

From 6dba9a08ec483463b5323079f386e482e9f009c3 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu, 21 Dec 2023 19:01:15 +0200
Subject: [PATCH v1 4/5] Omit columns from final tlist that were only needed to
 represent ORDER BY

---
 src/backend/optimizer/path/pathkeys.c      | 13 ++--
 src/backend/optimizer/plan/createplan.c    |  4 +-
 src/backend/optimizer/plan/planagg.c       |  2 +
 src/backend/optimizer/plan/planner.c       | 55 ++++++++++++++
 src/backend/optimizer/util/plancat.c       |  6 ++
 src/backend/optimizer/util/tlist.c         | 20 +++++
 src/include/nodes/pathnodes.h              |  3 +
 src/include/optimizer/optimizer.h          |  2 +
 src/test/regress/expected/create_index.out |  2 +-
 src/test/regress/expected/gist.out         |  6 +-
 src/test/regress/expected/groupingsets.out |  8 +-
 src/test/regress/expected/limit.out        | 16 ++--
 src/test/regress/expected/sqljson.out      | 12 +--
 src/test/regress/expected/subselect.out    |  2 +-
 src/test/regress/expected/window.out       | 88 +++++++++++-----------
 15 files changed, 163 insertions(+), 76 deletions(-)

diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index fdb60aaa8d2..d95eed80a29 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -877,13 +877,15 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
 			TargetEntry *tle;
 			Var		   *outer_var;
 
+			/* Is the TLE actually available to the outer query? */
 			if (sub_eclass->ec_sortref == 0)	/* can't happen */
 				elog(ERROR, "volatile EquivalenceClass has no sortref");
-			tle = get_sortgroupref_tle(sub_eclass->ec_sortref, subquery_tlist);
-			Assert(tle);
-			/* Is TLE actually available to the outer query? */
-			outer_var = find_var_for_subquery_tle(rel, tle);
-			if (outer_var)
+			tle = get_sortgroupref_tle_noerr(sub_eclass->ec_sortref, subquery_tlist);
+			if (tle)
+			{
+				outer_var = find_var_for_subquery_tle(rel, tle);
+				if (outer_var)
+					/* XXX: funny indentation just to avoid unnecessary churn in review; needs to be pgindented */
 			{
 				/* We can represent this sub_pathkey */
 				EquivalenceMember *sub_member;
@@ -922,6 +924,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
 											   sub_pathkey->pk_strategy,
 											   sub_pathkey->pk_nulls_first);
 			}
+			}
 		}
 		else
 		{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 5842d31bdcd..b4a672a371e 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -354,7 +354,7 @@ create_plan(PlannerInfo *root, Path *best_path)
 	 * nodes don't have a tlist matching the querytree targetlist.
 	 */
 	if (!IsA(plan, ModifyTable))
-		apply_tlist_labeling(plan->targetlist, root->processed_tlist);
+		apply_tlist_labeling(plan->targetlist, root->final_tlist);
 
 	/*
 	 * Attach any initPlans created in this query level to the topmost plan
@@ -2819,7 +2819,7 @@ create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path)
 	subplan = create_plan_recurse(root, subpath, CP_EXACT_TLIST);
 
 	/* Transfer resname/resjunk labeling, too, to keep executor happy */
-	apply_tlist_labeling(subplan->targetlist, root->processed_tlist);
+	apply_tlist_labeling(subplan->targetlist, root->final_tlist);
 
 	plan = make_modifytable(root,
 							subplan,
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
index 624afae6bce..05a6202c422 100644
--- a/src/backend/optimizer/plan/planagg.c
+++ b/src/backend/optimizer/plan/planagg.c
@@ -420,6 +420,8 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
 
 	final_rel = query_planner(subroot, minmax_qp_callback, NULL);
 
+	subroot->final_tlist = subroot->processed_tlist;
+
 	/*
 	 * Since we didn't go through subquery_planner() to handle the subquery,
 	 * we have to do some of the same cleanup it would do, in particular cope
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 6f45efde21d..e1c56a42607 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1301,6 +1301,7 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 	List	   *final_targets;
 	List	   *final_targets_contain_srfs;
 	bool		final_target_parallel_safe;
+	PathTarget *very_final_target;
 	RelOptInfo *current_rel;
 	RelOptInfo *final_rel;
 	FinalPathExtraData extra;
@@ -1358,6 +1359,17 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 			postprocess_setop_tlist(copyObject(root->processed_tlist),
 									parse->targetList);
 
+		root->final_tlist = NIL;
+		foreach (lc, root->processed_tlist)
+		{
+			TargetEntry *tle = (TargetEntry *) lfirst(lc);
+
+			if (tle->resjunk == JUNK_SORT_GROUP_COL || tle->resjunk == JUNK_PLANNER_ONLY)
+				continue;
+
+			root->final_tlist = lappend(root->final_tlist, tle);
+		}
+
 		/* Also extract the PathTarget form of the setop result tlist */
 		final_target = current_rel->cheapest_total_path->pathtarget;
 
@@ -1746,6 +1758,44 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 	final_rel->useridiscurrent = current_rel->useridiscurrent;
 	final_rel->fdwroutine = current_rel->fdwroutine;
 
+	/*
+	 * If the target list contains any junk columns that are not needed in the
+	 * executor, project them away. 'very_final_target' is the target list
+	 * with such columns removed.
+	 *
+	 * We used to let the executor filter these away with a "junk filter", if
+	 * the junk columns are expensive to compute, it's better to not compute
+	 * them in the first place. Usually we need such columns anyway, but one
+	 * case where we can avoid some real work is if we use an index to satisfy
+	 * the ORDER BY.
+	 *
+	 * XXX: The executor still has a junk filter and would filter these away
+	 * if we didn't. We could work a little harder here, and also add a
+	 * projection on top of the possible LockRows node, to remove any junk
+	 * columns created for row marks. With that, we could get rid of the junk
+	 * filter in the executor altogether.
+	 */
+	{
+		List	   *very_final_tlist = NIL;
+
+		foreach (lc, root->processed_tlist)
+		{
+			TargetEntry *tle = (TargetEntry *) lfirst(lc);
+
+			if (tle->resjunk == JUNK_SORT_GROUP_COL || tle->resjunk == JUNK_PLANNER_ONLY)
+				continue;
+
+			very_final_tlist = lappend(very_final_tlist, tle);
+		}
+		if (list_length(very_final_tlist) != list_length(root->processed_tlist))
+			very_final_target = create_pathtarget(root, very_final_tlist);
+		else
+			very_final_target = final_target;
+
+		/* Stash the final tlist that we will produce for create_plan() */
+		root->final_tlist = very_final_tlist;
+	}
+
 	/*
 	 * Generate paths for the final_rel.  Insert all surviving paths, with
 	 * LockRows, Limit, and/or ModifyTable steps added if needed.
@@ -1754,6 +1804,11 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 	{
 		Path	   *path = (Path *) lfirst(lc);
 
+		/* see comment above */
+		if (very_final_target != final_target)
+			path = apply_projection_to_path(root, final_rel,
+											path, very_final_target);
+
 		/*
 		 * If there is a FOR [KEY] UPDATE/SHARE clause, add the LockRows node.
 		 * (Note: we intentionally test parse->rowMarks not root->rowMarks
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 219040b52f0..0d736e2a5b3 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1774,7 +1774,13 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
 				/*
 				 * A resjunk column of the subquery can be reflected as
 				 * resjunk in the physical tlist; we need not punt.
+				 *
+				 * Subquery planner will filter out these junk columns
+				 * from the final plan, so reflect that here.
 				 */
+				if (tle->resjunk == JUNK_SORT_GROUP_COL || tle->resjunk == JUNK_PLANNER_ONLY)
+					continue;
+
 				var = makeVarFromTargetEntry(varno, tle);
 
 				tlist = lappend(tlist,
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
index a2a343e960e..c28bcb9cdea 100644
--- a/src/backend/optimizer/util/tlist.c
+++ b/src/backend/optimizer/util/tlist.c
@@ -358,6 +358,26 @@ get_sortgroupref_tle(Index sortref, List *targetList)
 	return NULL;				/* keep compiler quiet */
 }
 
+/*
+ * get_sortgroupref_tle_noerr
+ *		As above, but return NULL rather than throwing an error if not found.
+ */
+TargetEntry *
+get_sortgroupref_tle_noerr(Index sortref, List *targetList)
+{
+	ListCell   *l;
+
+	foreach(l, targetList)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(l);
+
+		if (tle->ressortgroupref == sortref)
+			return tle;
+	}
+
+	return NULL;
+}
+
 /*
  * get_sortgroupclause_tle
  *		Find the targetlist entry matching the given SortGroupClause
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index ed85dc7414b..c8668e0ee47 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -452,6 +452,9 @@ struct PlannerInfo
 	 */
 	List	   *processed_tlist;
 
+	/* same as processed_tlist, but with the planner JUNK_* cols removed */
+	List	   *final_tlist;
+
 	/*
 	 * For UPDATE, this list contains the target table's attribute numbers to
 	 * which the first N entries of processed_tlist are to be assigned.  (Any
diff --git a/src/include/optimizer/optimizer.h b/src/include/optimizer/optimizer.h
index da50044bf14..7514432b8c0 100644
--- a/src/include/optimizer/optimizer.h
+++ b/src/include/optimizer/optimizer.h
@@ -169,6 +169,8 @@ extern bool predicate_refuted_by(List *predicate_list, List *clause_list,
 extern int	count_nonjunk_tlist_entries(List *tlist);
 extern TargetEntry *get_sortgroupref_tle(Index sortref,
 										 List *targetList);
+extern TargetEntry *get_sortgroupref_tle_noerr(Index sortref,
+											   List *targetList);
 extern TargetEntry *get_sortgroupclause_tle(SortGroupClause *sgClause,
 											List *targetList);
 extern Node *get_sortgroupclause_expr(SortGroupClause *sgClause,
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 1e2222929f1..a12d38a8c8e 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -464,7 +464,7 @@ SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
                       QUERY PLAN                      
 ------------------------------------------------------
  Index Only Scan using gpointind on pg_temp.point_tbl
-   Output: f1, (f1 <-> '(0,1)'::point)
+   Output: f1
    Order By: (point_tbl.f1 <-> '(0,1)'::point)
 (3 rows)
 
diff --git a/src/test/regress/expected/gist.out b/src/test/regress/expected/gist.out
index 018b76f21d1..ed6722f48f7 100644
--- a/src/test/regress/expected/gist.out
+++ b/src/test/regress/expected/gist.out
@@ -81,7 +81,7 @@ order by p <-> point(0.201, 0.201);
                           QUERY PLAN                           
 ---------------------------------------------------------------
  Index Only Scan using gist_tbl_point_index on public.gist_tbl
-   Output: p, (p <-> '(0.201,0.201)'::point)
+   Output: p
    Index Cond: (gist_tbl.p <@ '(0.5,0.5),(0,0)'::box)
    Order By: (gist_tbl.p <-> '(0.201,0.201)'::point)
 (4 rows)
@@ -380,9 +380,9 @@ select p from gist_tbl order by circle(p,1) <-> point(0,0) limit 1;
                                      QUERY PLAN                                     
 ------------------------------------------------------------------------------------
  Limit
-   Output: p, ((circle(p, '1'::double precision) <-> '(0,0)'::point))
+   Output: p
    ->  Index Only Scan using gist_tbl_multi_index on public.gist_tbl
-         Output: p, (circle(p, '1'::double precision) <-> '(0,0)'::point)
+         Output: p
          Order By: ((circle(gist_tbl.p, '1'::double precision)) <-> '(0,0)'::point)
 (5 rows)
 
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index a3b9aaca84c..113f7a72b1f 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -472,10 +472,10 @@ select grouping(ss.x)
 from int8_tbl i1
 cross join lateral (select (select i1.q1) as x) ss
 group by ss.x;
-                   QUERY PLAN                   
-------------------------------------------------
+                 QUERY PLAN                 
+--------------------------------------------
  GroupAggregate
-   Output: GROUPING((SubPlan 1)), ((SubPlan 2))
+   Output: GROUPING((SubPlan 1))
    Group Key: ((SubPlan 2))
    ->  Sort
          Output: ((SubPlan 2)), i1.q1
@@ -505,7 +505,7 @@ group by ss.x;
                  QUERY PLAN                 
 --------------------------------------------
  GroupAggregate
-   Output: (SubPlan 2), ((SubPlan 3))
+   Output: (SubPlan 2)
    Group Key: ((SubPlan 3))
    ->  Sort
          Output: ((SubPlan 3)), i1.q1
diff --git a/src/test/regress/expected/limit.out b/src/test/regress/expected/limit.out
index a2cd0f9f5b8..60e572c53f2 100644
--- a/src/test/regress/expected/limit.out
+++ b/src/test/regress/expected/limit.out
@@ -401,12 +401,12 @@ select currval('testseq');
 explain (verbose, costs off)
 select unique1, unique2, nextval('testseq')
   from tenk1 order by tenthous limit 10;
-                                QUERY PLAN                                
---------------------------------------------------------------------------
+                           QUERY PLAN                           
+----------------------------------------------------------------
  Limit
-   Output: unique1, unique2, (nextval('testseq'::regclass)), tenthous
+   Output: unique1, unique2, (nextval('testseq'::regclass))
    ->  Result
-         Output: unique1, unique2, nextval('testseq'::regclass), tenthous
+         Output: unique1, unique2, nextval('testseq'::regclass)
          ->  Sort
                Output: unique1, unique2, tenthous
                Sort Key: tenk1.tenthous
@@ -535,12 +535,12 @@ order by s2 desc;
 explain (verbose, costs off)
 select sum(tenthous) as s1, sum(tenthous) + random()*0 as s2
   from tenk1 group by thousand order by thousand limit 3;
-                                                    QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
+                                               QUERY PLAN                                                
+---------------------------------------------------------------------------------------------------------
  Limit
-   Output: (sum(tenthous)), (((sum(tenthous))::double precision + (random() * '0'::double precision))), thousand
+   Output: (sum(tenthous)), (((sum(tenthous))::double precision + (random() * '0'::double precision)))
    ->  GroupAggregate
-         Output: sum(tenthous), ((sum(tenthous))::double precision + (random() * '0'::double precision)), thousand
+         Output: sum(tenthous), ((sum(tenthous))::double precision + (random() * '0'::double precision))
          Group Key: tenk1.thousand
          ->  Index Only Scan using tenk1_thous_tenthous on public.tenk1
                Output: thousand, tenthous
diff --git a/src/test/regress/expected/sqljson.out b/src/test/regress/expected/sqljson.out
index 5e7da96be5e..505d7f46500 100644
--- a/src/test/regress/expected/sqljson.out
+++ b/src/test/regress/expected/sqljson.out
@@ -1002,10 +1002,10 @@ FROM generate_series(1,5) i;
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT JSON_OBJECTAGG(i: ('111' || i)::bytea FORMAT JSON WITH UNIQUE RETURNING text) OVER (PARTITION BY i % 2)
 FROM generate_series(1,5) i;
-                                                            QUERY PLAN                                                             
------------------------------------------------------------------------------------------------------------------------------------
+                                                       QUERY PLAN                                                       
+------------------------------------------------------------------------------------------------------------------------
  WindowAgg
-   Output: JSON_OBJECTAGG(i : (('111'::text || (i)::text))::bytea FORMAT JSON WITH UNIQUE KEYS RETURNING text) OVER (?), ((i % 2))
+   Output: JSON_OBJECTAGG(i : (('111'::text || (i)::text))::bytea FORMAT JSON WITH UNIQUE KEYS RETURNING text) OVER (?)
    ->  Sort
          Output: ((i % 2)), i
          Sort Key: ((i.i % 2))
@@ -1038,10 +1038,10 @@ FROM generate_series(1,5) i;
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT JSON_ARRAYAGG(('111' || i)::bytea FORMAT JSON NULL ON NULL RETURNING text) OVER (PARTITION BY i % 2)
 FROM generate_series(1,5) i;
-                                                        QUERY PLAN                                                        
---------------------------------------------------------------------------------------------------------------------------
+                                                  QUERY PLAN                                                   
+---------------------------------------------------------------------------------------------------------------
  WindowAgg
-   Output: JSON_ARRAYAGG((('111'::text || (i)::text))::bytea FORMAT JSON NULL ON NULL RETURNING text) OVER (?), ((i % 2))
+   Output: JSON_ARRAYAGG((('111'::text || (i)::text))::bytea FORMAT JSON NULL ON NULL RETURNING text) OVER (?)
    ->  Sort
          Output: ((i % 2)), i
          Sort Key: ((i.i % 2))
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index a3a4d03d104..6cdc1ffcf87 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1636,7 +1636,7 @@ select * from explain_sq_limit();
                         explain_sq_limit                        
 ----------------------------------------------------------------
  Limit (actual rows=3 loops=1)
-   ->  Subquery Scan on x (actual rows=3 loops=1)
+   ->  Result (actual rows=3 loops=1)
          ->  Sort (actual rows=3 loops=1)
                Sort Key: sq_limit.c1, sq_limit.pk
                Sort Method: top-N heapsort  Memory: xxx
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 2201740c185..2f0ff7b08db 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -3731,10 +3731,10 @@ SELECT
     count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
                    CURRENT ROW AND CURRENT ROW) cnt
 FROM empsalary;
-                                              QUERY PLAN                                              
-------------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  WindowAgg
-   Output: empno, depname, (row_number() OVER (?)), (rank() OVER (?)), count(*) OVER (?), enroll_date
+   Output: empno, depname, (row_number() OVER (?)), (rank() OVER (?)), count(*) OVER (?)
    ->  WindowAgg
          Output: depname, enroll_date, empno, row_number() OVER (?), rank() OVER (?)
          ->  Sort
@@ -3777,16 +3777,15 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname || 'A', depname) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                                QUERY PLAN                                
---------------------------------------------------------------------------
- Subquery Scan on emp
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ WindowAgg
    ->  WindowAgg
-         ->  WindowAgg
-               ->  Sort
-                     Sort Key: (((empsalary.depname)::text || 'A'::text))
-                     ->  Seq Scan on empsalary
-                           Filter: ((depname)::text = 'sales'::text)
-(7 rows)
+         ->  Sort
+               Sort Key: (((empsalary.depname)::text || 'A'::text))
+               ->  Seq Scan on empsalary
+                     Filter: ((depname)::text = 'sales'::text)
+(6 rows)
 
 -- pushdown is unsafe because there's a PARTITION BY clause without depname:
 EXPLAIN (COSTS OFF)
@@ -4147,23 +4146,22 @@ SELECT * FROM
           ntile(2) OVER (PARTITION BY depname) nt -- w2
    FROM empsalary
 ) e WHERE rn <= 1 AND c1 <= 3 AND nt < 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
- Subquery Scan on e
-   ->  WindowAgg
-         Filter: (((row_number() OVER (?)) <= 1) AND ((ntile(2) OVER (?)) < 2))
-         Run Condition: (count(empsalary.salary) OVER (?) <= 3)
-         ->  Sort
-               Sort Key: (((empsalary.depname)::text || ''::text))
-               ->  WindowAgg
-                     Run Condition: ((row_number() OVER (?) <= 1) AND (ntile(2) OVER (?) < 2))
-                     ->  Sort
-                           Sort Key: empsalary.depname
-                           ->  WindowAgg
-                                 ->  Sort
-                                       Sort Key: ((''::text || (empsalary.depname)::text))
-                                       ->  Seq Scan on empsalary
-(14 rows)
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ WindowAgg
+   Filter: (((row_number() OVER (?)) <= 1) AND ((ntile(2) OVER (?)) < 2))
+   Run Condition: (count(empsalary.salary) OVER (?) <= 3)
+   ->  Sort
+         Sort Key: (((empsalary.depname)::text || ''::text))
+         ->  WindowAgg
+               Run Condition: ((row_number() OVER (?) <= 1) AND (ntile(2) OVER (?) < 2))
+               ->  Sort
+                     Sort Key: empsalary.depname
+                     ->  WindowAgg
+                           ->  Sort
+                                 Sort Key: ((''::text || (empsalary.depname)::text))
+                                 ->  Seq Scan on empsalary
+(13 rows)
 
 -- Ensure we correctly filter out all of the run conditions from each window
 SELECT * FROM
@@ -4268,16 +4266,15 @@ SELECT * FROM
           min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
    FROM empsalary) emp
 WHERE depname = 'sales';
-                              QUERY PLAN                              
-----------------------------------------------------------------------
- Subquery Scan on emp
+                           QUERY PLAN                           
+----------------------------------------------------------------
+ WindowAgg
    ->  WindowAgg
-         ->  WindowAgg
-               ->  Sort
-                     Sort Key: empsalary.empno, empsalary.enroll_date
-                     ->  Seq Scan on empsalary
-                           Filter: ((depname)::text = 'sales'::text)
-(7 rows)
+         ->  Sort
+               Sort Key: empsalary.empno, empsalary.enroll_date
+               ->  Seq Scan on empsalary
+                     Filter: ((depname)::text = 'sales'::text)
+(6 rows)
 
 -- Ensure that the evaluation order of the WindowAggs results in the WindowAgg
 -- with the same sort order that's required by the ORDER BY is evaluated last.
@@ -5330,14 +5327,13 @@ AS $$
     WINDOW w AS (ORDER BY s ROWS BETWEEN CURRENT ROW AND GROUP_SIZE FOLLOWING)
 $$ LANGUAGE SQL STABLE;
 EXPLAIN (costs off) SELECT * FROM pg_temp.f(2);
-                      QUERY PLAN                      
-------------------------------------------------------
- Subquery Scan on f
-   ->  WindowAgg
-         ->  Sort
-               Sort Key: s.s
-               ->  Function Scan on generate_series s
-(5 rows)
+                   QUERY PLAN                   
+------------------------------------------------
+ WindowAgg
+   ->  Sort
+         Sort Key: s.s
+         ->  Function Scan on generate_series s
+(4 rows)
 
 SELECT * FROM pg_temp.f(2);
     f    
-- 
2.39.2

v1-0005-Fix-regression-tests-caused-by-additional-Result-.patchtext/x-patch; charset=UTF-8; name=v1-0005-Fix-regression-tests-caused-by-additional-Result-.patchDownload

From f076f861abb9c4d7a3f8943830aa98792b642609 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu, 21 Dec 2023 19:03:15 +0200
Subject: [PATCH v1 5/5] Fix regression tests caused by additional Result nodes
 on top of Sorts et al.

We not add an extra Result node to project away the unneeded ORDER BY
junk columns. This may seem bad from a performance point of view, but
remember that previously the junk filter in the executor was doing
basically the same work anyway. It's just represented in the plan tree
now.

Perhaps we should teach Sort and MergeAppend nodes how to project to
avoid this churn.

(This commit should be squashed with the previous commit that caused
these diffs, but I'm keeping them separate for now to make it easier
to see the effect of the previous commit.)
---
 src/test/regress/expected/aggregates.out      |  35 +--
 src/test/regress/expected/collate.out         |  13 +-
 src/test/regress/expected/create_am.out       |  15 +-
 src/test/regress/expected/create_index.out    |  34 +--
 src/test/regress/expected/geometry.out        |  30 +-
 src/test/regress/expected/groupingsets.out    |  44 +--
 src/test/regress/expected/inherit.out         | 256 +++++++++---------
 src/test/regress/expected/join.out            |  48 ++--
 src/test/regress/expected/limit.out           |  22 +-
 .../regress/expected/partition_aggregate.out  | 180 ++++++------
 src/test/regress/expected/partition_join.out  |  37 +--
 src/test/regress/expected/partition_prune.out |  57 ++--
 src/test/regress/expected/rowsecurity.out     |  18 +-
 src/test/regress/expected/select_parallel.out |  63 +++--
 src/test/regress/expected/tsrf.out            |  24 +-
 15 files changed, 459 insertions(+), 417 deletions(-)

diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index e7814b7527b..2388176579d 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -735,22 +735,24 @@ explain (verbose, costs off)
 select array(select sum(x+y) s
             from generate_series(1,3) y group by y order by s)
   from generate_series(1,3) x;
-                            QUERY PLAN                             
--------------------------------------------------------------------
+                               QUERY PLAN                                
+-------------------------------------------------------------------------
  Function Scan on pg_catalog.generate_series x
    Output: (SubPlan 1)
    Function Call: generate_series(1, 3)
    SubPlan 1
-     ->  Sort
-           Output: (sum((x.x + y.y))), y.y
-           Sort Key: (sum((x.x + y.y)))
-           ->  HashAggregate
-                 Output: sum((x.x + y.y)), y.y
-                 Group Key: y.y
-                 ->  Function Scan on pg_catalog.generate_series y
-                       Output: y.y
-                       Function Call: generate_series(1, 3)
-(13 rows)
+     ->  Result
+           Output: (sum((x.x + y.y)))
+           ->  Sort
+                 Output: (sum((x.x + y.y))), y.y
+                 Sort Key: (sum((x.x + y.y)))
+                 ->  HashAggregate
+                       Output: sum((x.x + y.y)), y.y
+                       Group Key: y.y
+                       ->  Function Scan on pg_catalog.generate_series y
+                             Output: y.y
+                             Function Call: generate_series(1, 3)
+(15 rows)
 
 select array(select sum(x+y) s
             from generate_series(1,3) y group by y order by s)
@@ -1147,14 +1149,15 @@ explain (costs off)
   select max(unique2) from tenk1 order by max(unique2)+1;
                              QUERY PLAN                              
 ---------------------------------------------------------------------
- Sort
-   Sort Key: (($0 + 1))
+ Result
    InitPlan 1 (returns $0)
      ->  Limit
            ->  Index Only Scan Backward using tenk1_unique2 on tenk1
                  Index Cond: (unique2 IS NOT NULL)
-   ->  Result
-(7 rows)
+   ->  Sort
+         Sort Key: (($0 + 1))
+         ->  Result
+(8 rows)
 
 select max(unique2) from tenk1 order by max(unique2)+1;
  max  
diff --git a/src/test/regress/expected/collate.out b/src/test/regress/expected/collate.out
index 06495644852..2cf3e371179 100644
--- a/src/test/regress/expected/collate.out
+++ b/src/test/regress/expected/collate.out
@@ -642,12 +642,13 @@ EXPLAIN (COSTS OFF)
 
 EXPLAIN (COSTS OFF)
   SELECT * FROM collate_test10 ORDER BY x DESC, y COLLATE "C" ASC NULLS FIRST;
-                        QUERY PLAN                         
------------------------------------------------------------
- Sort
-   Sort Key: x COLLATE "C" DESC, y COLLATE "C" NULLS FIRST
-   ->  Seq Scan on collate_test10
-(3 rows)
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: x COLLATE "C" DESC, y COLLATE "C" NULLS FIRST
+         ->  Seq Scan on collate_test10
+(4 rows)
 
 -- CREATE/DROP COLLATION
 CREATE COLLATION mycoll1 FROM "C";
diff --git a/src/test/regress/expected/create_am.out b/src/test/regress/expected/create_am.out
index b50293d514f..6d70b1f0418 100644
--- a/src/test/regress/expected/create_am.out
+++ b/src/test/regress/expected/create_am.out
@@ -46,13 +46,14 @@ EXPLAIN (COSTS OFF)
 SELECT * FROM fast_emp4000
     WHERE home_base <@ '(200,200),(2000,1000)'::box
     ORDER BY (home_base[0])[0];
-                           QUERY PLAN                            
------------------------------------------------------------------
- Sort
-   Sort Key: ((home_base[0])[0])
-   ->  Index Only Scan using grect2ind2 on fast_emp4000
-         Index Cond: (home_base <@ '(2000,1000),(200,200)'::box)
-(4 rows)
+                              QUERY PLAN                               
+-----------------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: ((home_base[0])[0])
+         ->  Index Only Scan using grect2ind2 on fast_emp4000
+               Index Cond: (home_base <@ '(2000,1000),(200,200)'::box)
+(5 rows)
 
 SELECT * FROM fast_emp4000
     WHERE home_base <@ '(200,200),(2000,1000)'::box
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index a12d38a8c8e..908e030d2d7 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -247,13 +247,14 @@ EXPLAIN (COSTS OFF)
 SELECT * FROM fast_emp4000
     WHERE home_base <@ '(200,200),(2000,1000)'::box
     ORDER BY (home_base[0])[0];
-                           QUERY PLAN                            
------------------------------------------------------------------
- Sort
-   Sort Key: ((home_base[0])[0])
-   ->  Index Only Scan using grect2ind on fast_emp4000
-         Index Cond: (home_base <@ '(2000,1000),(200,200)'::box)
-(4 rows)
+                              QUERY PLAN                               
+-----------------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: ((home_base[0])[0])
+         ->  Index Only Scan using grect2ind on fast_emp4000
+               Index Cond: (home_base <@ '(2000,1000),(200,200)'::box)
+(5 rows)
 
 SELECT * FROM fast_emp4000
     WHERE home_base <@ '(200,200),(2000,1000)'::box
@@ -622,15 +623,16 @@ SET enable_indexscan = OFF;
 SET enable_bitmapscan = ON;
 EXPLAIN (COSTS OFF)
 SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0,1';
-                         QUERY PLAN                         
-------------------------------------------------------------
- Sort
-   Sort Key: ((f1 <-> '(0,1)'::point))
-   ->  Bitmap Heap Scan on point_tbl
-         Recheck Cond: (f1 <@ '(10,10),(-10,-10)'::box)
-         ->  Bitmap Index Scan on gpointind
-               Index Cond: (f1 <@ '(10,10),(-10,-10)'::box)
-(6 rows)
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: ((f1 <-> '(0,1)'::point))
+         ->  Bitmap Heap Scan on point_tbl
+               Recheck Cond: (f1 <@ '(10,10),(-10,-10)'::box)
+               ->  Bitmap Index Scan on gpointind
+                     Index Cond: (f1 <@ '(10,10),(-10,-10)'::box)
+(7 rows)
 
 SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0,1';
         f1        
diff --git a/src/test/regress/expected/geometry.out b/src/test/regress/expected/geometry.out
index 8be694f46be..843b3664b4f 100644
--- a/src/test/regress/expected/geometry.out
+++ b/src/test/regress/expected/geometry.out
@@ -5250,13 +5250,14 @@ SELECT * FROM circle_tbl WHERE f1 && circle(point(1,-2), 1)
 EXPLAIN (COSTS OFF)
 SELECT * FROM circle_tbl WHERE f1 && circle(point(1,-2), 1)
     ORDER BY area(f1);
-                  QUERY PLAN                  
-----------------------------------------------
- Sort
-   Sort Key: (area(f1))
-   ->  Seq Scan on circle_tbl
-         Filter: (f1 && '<(1,-2),1>'::circle)
-(4 rows)
+                     QUERY PLAN                     
+----------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: (area(f1))
+         ->  Seq Scan on circle_tbl
+               Filter: (f1 && '<(1,-2),1>'::circle)
+(5 rows)
 
 SELECT * FROM circle_tbl WHERE f1 && circle(point(1,-2), 1)
     ORDER BY area(f1);
@@ -5280,13 +5281,14 @@ SELECT * FROM polygon_tbl WHERE f1 @> '((1,1),(2,2),(2,1))'::polygon
 EXPLAIN (COSTS OFF)
 SELECT * FROM polygon_tbl WHERE f1 @> '((1,1),(2,2),(2,1))'::polygon
     ORDER BY (poly_center(f1))[0];
-                       QUERY PLAN                       
---------------------------------------------------------
- Sort
-   Sort Key: ((poly_center(f1))[0])
-   ->  Seq Scan on polygon_tbl
-         Filter: (f1 @> '((1,1),(2,2),(2,1))'::polygon)
-(4 rows)
+                          QUERY PLAN                          
+--------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: ((poly_center(f1))[0])
+         ->  Seq Scan on polygon_tbl
+               Filter: (f1 @> '((1,1),(2,2),(2,1))'::polygon)
+(5 rows)
 
 SELECT * FROM polygon_tbl WHERE f1 @> '((1,1),(2,2),(2,1))'::polygon
     ORDER BY (poly_center(f1))[0];
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 113f7a72b1f..77bc41d9dd9 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -1204,17 +1204,18 @@ explain (costs off)
          count(*), sum(v)
     from gstest4 group by grouping sets ((v,unhashable_col),(v,unsortable_col))
    order by 3,5;
-                            QUERY PLAN                            
-------------------------------------------------------------------
- Sort
-   Sort Key: (GROUPING(unhashable_col, unsortable_col)), (sum(v))
-   ->  MixedAggregate
-         Hash Key: v, unsortable_col
-         Group Key: v, unhashable_col
-         ->  Sort
-               Sort Key: v, unhashable_col
-               ->  Seq Scan on gstest4
-(8 rows)
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: (GROUPING(unhashable_col, unsortable_col)), (sum(v))
+         ->  MixedAggregate
+               Hash Key: v, unsortable_col
+               Group Key: v, unhashable_col
+               ->  Sort
+                     Sort Key: v, unhashable_col
+                     ->  Seq Scan on gstest4
+(9 rows)
 
 -- empty input: first is 0 rows, second 1, third 3 etc.
 select a, b, sum(v), count(*) from gstest_empty group by grouping sets ((a,b),a);
@@ -1376,16 +1377,17 @@ select a, b, grouping(a,b), sum(v), count(*), max(v)
 explain (costs off)
   select a, b, grouping(a,b), sum(v), count(*), max(v)
     from gstest1 group by grouping sets ((a,b),(a+1,b+1),(a+2,b+2)) order by 3,6;
-                                        QUERY PLAN                                         
--------------------------------------------------------------------------------------------
- Sort
-   Sort Key: (GROUPING("*VALUES*".column1, "*VALUES*".column2)), (max("*VALUES*".column3))
-   ->  HashAggregate
-         Hash Key: "*VALUES*".column1, "*VALUES*".column2
-         Hash Key: ("*VALUES*".column1 + 1), ("*VALUES*".column2 + 1)
-         Hash Key: ("*VALUES*".column1 + 2), ("*VALUES*".column2 + 2)
-         ->  Values Scan on "*VALUES*"
-(7 rows)
+                                           QUERY PLAN                                            
+-------------------------------------------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: (GROUPING("*VALUES*".column1, "*VALUES*".column2)), (max("*VALUES*".column3))
+         ->  HashAggregate
+               Hash Key: "*VALUES*".column1, "*VALUES*".column2
+               Hash Key: ("*VALUES*".column1 + 1), ("*VALUES*".column2 + 1)
+               Hash Key: ("*VALUES*".column1 + 2), ("*VALUES*".column2 + 2)
+               ->  Values Scan on "*VALUES*"
+(8 rows)
 
 select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
   from gstest2 group by cube (a,b) order by rsum, a, b;
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index 0f1aa831f64..9ff8edac407 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1539,23 +1539,25 @@ insert into matest3 (name) values ('Test 5');
 insert into matest3 (name) values ('Test 6');
 set enable_indexscan = off;  -- force use of seqscan/sort, so no merge
 explain (verbose, costs off) select * from matest0 order by 1-id;
-                         QUERY PLAN                         
-------------------------------------------------------------
- Sort
-   Output: matest0.id, matest0.name, ((1 - matest0.id))
-   Sort Key: ((1 - matest0.id))
-   ->  Result
-         Output: matest0.id, matest0.name, (1 - matest0.id)
-         ->  Append
-               ->  Seq Scan on public.matest0 matest0_1
-                     Output: matest0_1.id, matest0_1.name
-               ->  Seq Scan on public.matest1 matest0_2
-                     Output: matest0_2.id, matest0_2.name
-               ->  Seq Scan on public.matest2 matest0_3
-                     Output: matest0_3.id, matest0_3.name
-               ->  Seq Scan on public.matest3 matest0_4
-                     Output: matest0_4.id, matest0_4.name
-(14 rows)
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Result
+   Output: matest0.id, matest0.name
+   ->  Sort
+         Output: matest0.id, matest0.name, ((1 - matest0.id))
+         Sort Key: ((1 - matest0.id))
+         ->  Result
+               Output: matest0.id, matest0.name, (1 - matest0.id)
+               ->  Append
+                     ->  Seq Scan on public.matest0 matest0_1
+                           Output: matest0_1.id, matest0_1.name
+                     ->  Seq Scan on public.matest1 matest0_2
+                           Output: matest0_2.id, matest0_2.name
+                     ->  Seq Scan on public.matest2 matest0_3
+                           Output: matest0_3.id, matest0_3.name
+                     ->  Seq Scan on public.matest3 matest0_4
+                           Output: matest0_4.id, matest0_4.name
+(16 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
@@ -1594,22 +1596,24 @@ reset enable_indexscan;
 set enable_seqscan = off;  -- plan with fewest seqscans should be merge
 set enable_parallel_append = off; -- Don't let parallel-append interfere
 explain (verbose, costs off) select * from matest0 order by 1-id;
-                               QUERY PLAN                               
-------------------------------------------------------------------------
- Merge Append
-   Sort Key: ((1 - matest0.id))
-   ->  Index Scan using matest0i on public.matest0 matest0_1
-         Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
-   ->  Index Scan using matest1i on public.matest1 matest0_2
-         Output: matest0_2.id, matest0_2.name, (1 - matest0_2.id)
-   ->  Sort
-         Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
-         Sort Key: ((1 - matest0_3.id))
-         ->  Seq Scan on public.matest2 matest0_3
-               Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
-   ->  Index Scan using matest3i on public.matest3 matest0_4
-         Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
-(13 rows)
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Result
+   Output: matest0.id, matest0.name
+   ->  Merge Append
+         Sort Key: ((1 - matest0.id))
+         ->  Index Scan using matest0i on public.matest0 matest0_1
+               Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
+         ->  Index Scan using matest1i on public.matest1 matest0_2
+               Output: matest0_2.id, matest0_2.name, (1 - matest0_2.id)
+         ->  Sort
+               Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
+               Sort Key: ((1 - matest0_3.id))
+               ->  Seq Scan on public.matest2 matest0_3
+                     Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
+         ->  Index Scan using matest3i on public.matest3 matest0_4
+               Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
+(15 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
@@ -2827,46 +2831,49 @@ drop table parted_minmax;
 create index mcrparted_a_abs_c_idx on mcrparted (a, abs(b), c);
 -- MergeAppend must be used when a default partition exists
 explain (costs off) select * from mcrparted order by a, abs(b), c;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
- Merge Append
-   Sort Key: mcrparted.a, (abs(mcrparted.b)), mcrparted.c
-   ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
-   ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
-   ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
-   ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
-   ->  Index Scan using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
-   ->  Index Scan using mcrparted5_a_abs_c_idx on mcrparted5 mcrparted_6
-   ->  Index Scan using mcrparted_def_a_abs_c_idx on mcrparted_def mcrparted_7
-(9 rows)
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Result
+   ->  Merge Append
+         Sort Key: mcrparted.a, (abs(mcrparted.b)), mcrparted.c
+         ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
+         ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
+         ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
+         ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
+         ->  Index Scan using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
+         ->  Index Scan using mcrparted5_a_abs_c_idx on mcrparted5 mcrparted_6
+         ->  Index Scan using mcrparted_def_a_abs_c_idx on mcrparted_def mcrparted_7
+(10 rows)
 
 drop table mcrparted_def;
 -- Append is used for a RANGE partitioned table with no default
 -- and no subpartitions
 explain (costs off) select * from mcrparted order by a, abs(b), c;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
- Append
-   ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
-   ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
-   ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
-   ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
-   ->  Index Scan using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
-   ->  Index Scan using mcrparted5_a_abs_c_idx on mcrparted5 mcrparted_6
-(7 rows)
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Result
+   ->  Append
+         ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
+         ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
+         ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
+         ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
+         ->  Index Scan using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
+         ->  Index Scan using mcrparted5_a_abs_c_idx on mcrparted5 mcrparted_6
+(8 rows)
 
 -- Append is used with subpaths in reverse order with backwards index scans
 explain (costs off) select * from mcrparted order by a desc, abs(b) desc, c desc;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- Append
-   ->  Index Scan Backward using mcrparted5_a_abs_c_idx on mcrparted5 mcrparted_6
-   ->  Index Scan Backward using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
-   ->  Index Scan Backward using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
-   ->  Index Scan Backward using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
-   ->  Index Scan Backward using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
-   ->  Index Scan Backward using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
-(7 rows)
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Result
+   ->  Append
+         ->  Index Scan Backward using mcrparted5_a_abs_c_idx on mcrparted5 mcrparted_6
+         ->  Index Scan Backward using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
+         ->  Index Scan Backward using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
+         ->  Index Scan Backward using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
+         ->  Index Scan Backward using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
+         ->  Index Scan Backward using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
+(8 rows)
 
 -- check that Append plan is used containing a MergeAppend for sub-partitions
 -- that are unordered.
@@ -2875,51 +2882,54 @@ create table mcrparted5 partition of mcrparted for values from (20, 20, 20) to (
 create table mcrparted5a partition of mcrparted5 for values in(20);
 create table mcrparted5_def partition of mcrparted5 default;
 explain (costs off) select * from mcrparted order by a, abs(b), c;
-                                      QUERY PLAN                                       
----------------------------------------------------------------------------------------
- Append
-   ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
-   ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
-   ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
-   ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
-   ->  Index Scan using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
-   ->  Merge Append
-         Sort Key: mcrparted_7.a, (abs(mcrparted_7.b)), mcrparted_7.c
-         ->  Index Scan using mcrparted5a_a_abs_c_idx on mcrparted5a mcrparted_7
-         ->  Index Scan using mcrparted5_def_a_abs_c_idx on mcrparted5_def mcrparted_8
-(10 rows)
+                                         QUERY PLAN                                          
+---------------------------------------------------------------------------------------------
+ Result
+   ->  Append
+         ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
+         ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
+         ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
+         ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
+         ->  Index Scan using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
+         ->  Merge Append
+               Sort Key: mcrparted_7.a, (abs(mcrparted_7.b)), mcrparted_7.c
+               ->  Index Scan using mcrparted5a_a_abs_c_idx on mcrparted5a mcrparted_7
+               ->  Index Scan using mcrparted5_def_a_abs_c_idx on mcrparted5_def mcrparted_8
+(11 rows)
 
 drop table mcrparted5_def;
 -- check that an Append plan is used and the sub-partitions are flattened
 -- into the main Append when the sub-partition is unordered but contains
 -- just a single sub-partition.
 explain (costs off) select a, abs(b) from mcrparted order by a, abs(b), c;
-                                QUERY PLAN                                 
----------------------------------------------------------------------------
- Append
-   ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
-   ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
-   ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
-   ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
-   ->  Index Scan using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
-   ->  Index Scan using mcrparted5a_a_abs_c_idx on mcrparted5a mcrparted_6
-(7 rows)
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Result
+   ->  Append
+         ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
+         ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
+         ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
+         ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
+         ->  Index Scan using mcrparted4_a_abs_c_idx on mcrparted4 mcrparted_5
+         ->  Index Scan using mcrparted5a_a_abs_c_idx on mcrparted5a mcrparted_6
+(8 rows)
 
 -- check that Append is used when the sub-partitioned tables are pruned
 -- during planning.
 explain (costs off) select * from mcrparted where a < 20 order by a, abs(b), c;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
- Append
-   ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
-         Index Cond: (a < 20)
-   ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
-         Index Cond: (a < 20)
-   ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
-         Index Cond: (a < 20)
-   ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
-         Index Cond: (a < 20)
-(9 rows)
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Result
+   ->  Append
+         ->  Index Scan using mcrparted0_a_abs_c_idx on mcrparted0 mcrparted_1
+               Index Cond: (a < 20)
+         ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
+               Index Cond: (a < 20)
+         ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
+               Index Cond: (a < 20)
+         ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
+               Index Cond: (a < 20)
+(10 rows)
 
 set enable_bitmapscan to off;
 set enable_sort to off;
@@ -3066,34 +3076,36 @@ create index on mcrparted2 (a, abs(b), c);
 create index on mcrparted3 (a, abs(b), c);
 create index on mcrparted4 (a, abs(b), c);
 explain (costs off) select * from mcrparted where a < 20 order by a, abs(b), c limit 1;
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
  Limit
-   ->  Append
-         ->  Sort
-               Sort Key: mcrparted_1.a, (abs(mcrparted_1.b)), mcrparted_1.c
-               ->  Seq Scan on mcrparted0 mcrparted_1
-                     Filter: (a < 20)
-         ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
-               Index Cond: (a < 20)
-         ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
-               Index Cond: (a < 20)
-         ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
-               Index Cond: (a < 20)
-(12 rows)
+   ->  Result
+         ->  Append
+               ->  Sort
+                     Sort Key: mcrparted_1.a, (abs(mcrparted_1.b)), mcrparted_1.c
+                     ->  Seq Scan on mcrparted0 mcrparted_1
+                           Filter: (a < 20)
+               ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_2
+                     Index Cond: (a < 20)
+               ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_3
+                     Index Cond: (a < 20)
+               ->  Index Scan using mcrparted3_a_abs_c_idx on mcrparted3 mcrparted_4
+                     Index Cond: (a < 20)
+(13 rows)
 
 set enable_bitmapscan = 0;
 -- Ensure Append node can be used when the partition is ordered by some
 -- pathkeys which were deemed redundant.
 explain (costs off) select * from mcrparted where a = 10 order by a, abs(b), c;
-                               QUERY PLAN                                
--------------------------------------------------------------------------
- Append
-   ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_1
-         Index Cond: (a = 10)
-   ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_2
-         Index Cond: (a = 10)
-(5 rows)
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
+ Result
+   ->  Append
+         ->  Index Scan using mcrparted1_a_abs_c_idx on mcrparted1 mcrparted_1
+               Index Cond: (a = 10)
+         ->  Index Scan using mcrparted2_a_abs_c_idx on mcrparted2 mcrparted_2
+               Index Cond: (a = 10)
+(6 rows)
 
 reset enable_bitmapscan;
 drop table mcrparted;
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2c73270143b..9708afefbfc 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -8209,30 +8209,32 @@ select t1.b, ss.phv from join_ut1 t1 left join lateral
               (select t2.a as t2a, t3.a t3a, least(t1.a, t2.a, t3.a) phv
 					  from join_pt1 t2 join join_ut1 t3 on t2.a = t3.b) ss
               on t1.a = ss.t2a order by t1.a;
-                             QUERY PLAN                             
---------------------------------------------------------------------
- Sort
-   Output: t1.b, (LEAST(t1.a, t2.a, t3.a)), t1.a
-   Sort Key: t1.a
-   ->  Nested Loop Left Join
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
+ Result
+   Output: t1.b, (LEAST(t1.a, t2.a, t3.a))
+   ->  Sort
          Output: t1.b, (LEAST(t1.a, t2.a, t3.a)), t1.a
-         ->  Seq Scan on public.join_ut1 t1
-               Output: t1.a, t1.b, t1.c
-         ->  Hash Join
-               Output: t2.a, LEAST(t1.a, t2.a, t3.a)
-               Hash Cond: (t3.b = t2.a)
-               ->  Seq Scan on public.join_ut1 t3
-                     Output: t3.a, t3.b, t3.c
-               ->  Hash
-                     Output: t2.a
-                     ->  Append
-                           ->  Seq Scan on public.join_pt1p1p1 t2_1
-                                 Output: t2_1.a
-                                 Filter: (t1.a = t2_1.a)
-                           ->  Seq Scan on public.join_pt1p2 t2_2
-                                 Output: t2_2.a
-                                 Filter: (t1.a = t2_2.a)
-(21 rows)
+         Sort Key: t1.a
+         ->  Nested Loop Left Join
+               Output: t1.b, (LEAST(t1.a, t2.a, t3.a)), t1.a
+               ->  Seq Scan on public.join_ut1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash Join
+                     Output: t2.a, LEAST(t1.a, t2.a, t3.a)
+                     Hash Cond: (t3.b = t2.a)
+                     ->  Seq Scan on public.join_ut1 t3
+                           Output: t3.a, t3.b, t3.c
+                     ->  Hash
+                           Output: t2.a
+                           ->  Append
+                                 ->  Seq Scan on public.join_pt1p1p1 t2_1
+                                       Output: t2_1.a
+                                       Filter: (t1.a = t2_1.a)
+                                 ->  Seq Scan on public.join_pt1p2 t2_2
+                                       Output: t2_2.a
+                                       Filter: (t1.a = t2_2.a)
+(23 rows)
 
 select t1.b, ss.phv from join_ut1 t1 left join lateral
               (select t2.a as t2a, t3.a t3a, least(t1.a, t2.a, t3.a) phv
diff --git a/src/test/regress/expected/limit.out b/src/test/regress/expected/limit.out
index 60e572c53f2..5aab9cdd6f7 100644
--- a/src/test/regress/expected/limit.out
+++ b/src/test/regress/expected/limit.out
@@ -465,18 +465,20 @@ select unique1, unique2, generate_series(1,10)
 explain (verbose, costs off)
 select unique1, unique2, generate_series(1,10)
   from tenk1 order by tenthous limit 7;
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
  Limit
-   Output: unique1, unique2, (generate_series(1, 10)), tenthous
-   ->  ProjectSet
-         Output: unique1, unique2, generate_series(1, 10), tenthous
-         ->  Sort
-               Output: unique1, unique2, tenthous
-               Sort Key: tenk1.tenthous
-               ->  Seq Scan on public.tenk1
+   Output: unique1, unique2, (generate_series(1, 10))
+   ->  Result
+         Output: unique1, unique2, (generate_series(1, 10))
+         ->  ProjectSet
+               Output: unique1, unique2, generate_series(1, 10), tenthous
+               ->  Sort
                      Output: unique1, unique2, tenthous
-(9 rows)
+                     Sort Key: tenk1.tenthous
+                     ->  Seq Scan on public.tenk1
+                           Output: unique1, unique2, tenthous
+(11 rows)
 
 select unique1, unique2, generate_series(1,10)
   from tenk1 order by tenthous limit 7;
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index 1b900fddf8e..462fe6cb775 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -862,24 +862,25 @@ SELECT a, sum(b), avg(c), count(*) FROM pagg_tab_m GROUP BY a HAVING avg(c) < 22
 -- Full aggregation as GROUP BY clause matches with PARTITION KEY
 EXPLAIN (COSTS OFF)
 SELECT a, sum(b), avg(c), count(*) FROM pagg_tab_m GROUP BY a, (a+b)/2 HAVING sum(b) < 50 ORDER BY 1, 2, 3;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- Sort
-   Sort Key: pagg_tab_m.a, (sum(pagg_tab_m.b)), (avg(pagg_tab_m.c))
-   ->  Append
-         ->  HashAggregate
-               Group Key: pagg_tab_m.a, ((pagg_tab_m.a + pagg_tab_m.b) / 2)
-               Filter: (sum(pagg_tab_m.b) < 50)
-               ->  Seq Scan on pagg_tab_m_p1 pagg_tab_m
-         ->  HashAggregate
-               Group Key: pagg_tab_m_1.a, ((pagg_tab_m_1.a + pagg_tab_m_1.b) / 2)
-               Filter: (sum(pagg_tab_m_1.b) < 50)
-               ->  Seq Scan on pagg_tab_m_p2 pagg_tab_m_1
-         ->  HashAggregate
-               Group Key: pagg_tab_m_2.a, ((pagg_tab_m_2.a + pagg_tab_m_2.b) / 2)
-               Filter: (sum(pagg_tab_m_2.b) < 50)
-               ->  Seq Scan on pagg_tab_m_p3 pagg_tab_m_2
-(15 rows)
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: pagg_tab_m.a, (sum(pagg_tab_m.b)), (avg(pagg_tab_m.c))
+         ->  Append
+               ->  HashAggregate
+                     Group Key: pagg_tab_m.a, ((pagg_tab_m.a + pagg_tab_m.b) / 2)
+                     Filter: (sum(pagg_tab_m.b) < 50)
+                     ->  Seq Scan on pagg_tab_m_p1 pagg_tab_m
+               ->  HashAggregate
+                     Group Key: pagg_tab_m_1.a, ((pagg_tab_m_1.a + pagg_tab_m_1.b) / 2)
+                     Filter: (sum(pagg_tab_m_1.b) < 50)
+                     ->  Seq Scan on pagg_tab_m_p2 pagg_tab_m_1
+               ->  HashAggregate
+                     Group Key: pagg_tab_m_2.a, ((pagg_tab_m_2.a + pagg_tab_m_2.b) / 2)
+                     Filter: (sum(pagg_tab_m_2.b) < 50)
+                     ->  Seq Scan on pagg_tab_m_p3 pagg_tab_m_2
+(16 rows)
 
 SELECT a, sum(b), avg(c), count(*) FROM pagg_tab_m GROUP BY a, (a+b)/2 HAVING sum(b) < 50 ORDER BY 1, 2, 3;
  a  | sum |         avg         | count 
@@ -895,24 +896,25 @@ SELECT a, sum(b), avg(c), count(*) FROM pagg_tab_m GROUP BY a, (a+b)/2 HAVING su
 -- Full aggregation as PARTITION KEY is part of GROUP BY clause
 EXPLAIN (COSTS OFF)
 SELECT a, c, sum(b), avg(c), count(*) FROM pagg_tab_m GROUP BY (a+b)/2, 2, 1 HAVING sum(b) = 50 AND avg(c) > 25 ORDER BY 1, 2, 3;
-                                            QUERY PLAN                                            
---------------------------------------------------------------------------------------------------
- Sort
-   Sort Key: pagg_tab_m.a, pagg_tab_m.c, (sum(pagg_tab_m.b))
-   ->  Append
-         ->  HashAggregate
-               Group Key: ((pagg_tab_m.a + pagg_tab_m.b) / 2), pagg_tab_m.c, pagg_tab_m.a
-               Filter: ((sum(pagg_tab_m.b) = 50) AND (avg(pagg_tab_m.c) > '25'::numeric))
-               ->  Seq Scan on pagg_tab_m_p1 pagg_tab_m
-         ->  HashAggregate
-               Group Key: ((pagg_tab_m_1.a + pagg_tab_m_1.b) / 2), pagg_tab_m_1.c, pagg_tab_m_1.a
-               Filter: ((sum(pagg_tab_m_1.b) = 50) AND (avg(pagg_tab_m_1.c) > '25'::numeric))
-               ->  Seq Scan on pagg_tab_m_p2 pagg_tab_m_1
-         ->  HashAggregate
-               Group Key: ((pagg_tab_m_2.a + pagg_tab_m_2.b) / 2), pagg_tab_m_2.c, pagg_tab_m_2.a
-               Filter: ((sum(pagg_tab_m_2.b) = 50) AND (avg(pagg_tab_m_2.c) > '25'::numeric))
-               ->  Seq Scan on pagg_tab_m_p3 pagg_tab_m_2
-(15 rows)
+                                               QUERY PLAN                                               
+--------------------------------------------------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: pagg_tab_m.a, pagg_tab_m.c, (sum(pagg_tab_m.b))
+         ->  Append
+               ->  HashAggregate
+                     Group Key: ((pagg_tab_m.a + pagg_tab_m.b) / 2), pagg_tab_m.c, pagg_tab_m.a
+                     Filter: ((sum(pagg_tab_m.b) = 50) AND (avg(pagg_tab_m.c) > '25'::numeric))
+                     ->  Seq Scan on pagg_tab_m_p1 pagg_tab_m
+               ->  HashAggregate
+                     Group Key: ((pagg_tab_m_1.a + pagg_tab_m_1.b) / 2), pagg_tab_m_1.c, pagg_tab_m_1.a
+                     Filter: ((sum(pagg_tab_m_1.b) = 50) AND (avg(pagg_tab_m_1.c) > '25'::numeric))
+                     ->  Seq Scan on pagg_tab_m_p2 pagg_tab_m_1
+               ->  HashAggregate
+                     Group Key: ((pagg_tab_m_2.a + pagg_tab_m_2.b) / 2), pagg_tab_m_2.c, pagg_tab_m_2.a
+                     Filter: ((sum(pagg_tab_m_2.b) = 50) AND (avg(pagg_tab_m_2.c) > '25'::numeric))
+                     ->  Seq Scan on pagg_tab_m_p3 pagg_tab_m_2
+(16 rows)
 
 SELECT a, c, sum(b), avg(c), count(*) FROM pagg_tab_m GROUP BY (a+b)/2, 2, 1 HAVING sum(b) = 50 AND avg(c) > 25 ORDER BY 1, 2, 3;
  a  | c  | sum |         avg         | count 
@@ -1123,32 +1125,33 @@ SELECT b, sum(a), count(*) FROM pagg_tab_ml GROUP BY b HAVING avg(a) < 15 ORDER
 -- Full aggregation at all levels as GROUP BY clause matches with PARTITION KEY
 EXPLAIN (COSTS OFF)
 SELECT a, sum(b), count(*) FROM pagg_tab_ml GROUP BY a, b, c HAVING avg(b) > 7 ORDER BY 1, 2, 3;
-                                 QUERY PLAN                                 
-----------------------------------------------------------------------------
- Sort
-   Sort Key: pagg_tab_ml.a, (sum(pagg_tab_ml.b)), (count(*))
-   ->  Append
-         ->  HashAggregate
-               Group Key: pagg_tab_ml.a, pagg_tab_ml.b, pagg_tab_ml.c
-               Filter: (avg(pagg_tab_ml.b) > '7'::numeric)
-               ->  Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
-         ->  HashAggregate
-               Group Key: pagg_tab_ml_1.a, pagg_tab_ml_1.b, pagg_tab_ml_1.c
-               Filter: (avg(pagg_tab_ml_1.b) > '7'::numeric)
-               ->  Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_1
-         ->  HashAggregate
-               Group Key: pagg_tab_ml_2.a, pagg_tab_ml_2.b, pagg_tab_ml_2.c
-               Filter: (avg(pagg_tab_ml_2.b) > '7'::numeric)
-               ->  Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_2
-         ->  HashAggregate
-               Group Key: pagg_tab_ml_3.a, pagg_tab_ml_3.b, pagg_tab_ml_3.c
-               Filter: (avg(pagg_tab_ml_3.b) > '7'::numeric)
-               ->  Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_3
-         ->  HashAggregate
-               Group Key: pagg_tab_ml_4.a, pagg_tab_ml_4.b, pagg_tab_ml_4.c
-               Filter: (avg(pagg_tab_ml_4.b) > '7'::numeric)
-               ->  Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_4
-(23 rows)
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: pagg_tab_ml.a, (sum(pagg_tab_ml.b)), (count(*))
+         ->  Append
+               ->  HashAggregate
+                     Group Key: pagg_tab_ml.a, pagg_tab_ml.b, pagg_tab_ml.c
+                     Filter: (avg(pagg_tab_ml.b) > '7'::numeric)
+                     ->  Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
+               ->  HashAggregate
+                     Group Key: pagg_tab_ml_1.a, pagg_tab_ml_1.b, pagg_tab_ml_1.c
+                     Filter: (avg(pagg_tab_ml_1.b) > '7'::numeric)
+                     ->  Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_1
+               ->  HashAggregate
+                     Group Key: pagg_tab_ml_2.a, pagg_tab_ml_2.b, pagg_tab_ml_2.c
+                     Filter: (avg(pagg_tab_ml_2.b) > '7'::numeric)
+                     ->  Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_2
+               ->  HashAggregate
+                     Group Key: pagg_tab_ml_3.a, pagg_tab_ml_3.b, pagg_tab_ml_3.c
+                     Filter: (avg(pagg_tab_ml_3.b) > '7'::numeric)
+                     ->  Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_3
+               ->  HashAggregate
+                     Group Key: pagg_tab_ml_4.a, pagg_tab_ml_4.b, pagg_tab_ml_4.c
+                     Filter: (avg(pagg_tab_ml_4.b) > '7'::numeric)
+                     ->  Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_4
+(24 rows)
 
 SELECT a, sum(b), count(*) FROM pagg_tab_ml GROUP BY a, b, c HAVING avg(b) > 7 ORDER BY 1, 2, 3;
  a  | sum  | count 
@@ -1279,34 +1282,35 @@ SELECT b, sum(a), count(*) FROM pagg_tab_ml GROUP BY b HAVING avg(a) < 15 ORDER
 -- Full aggregation at all levels as GROUP BY clause matches with PARTITION KEY
 EXPLAIN (COSTS OFF)
 SELECT a, sum(b), count(*) FROM pagg_tab_ml GROUP BY a, b, c HAVING avg(b) > 7 ORDER BY 1, 2, 3;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
  Gather Merge
    Workers Planned: 2
-   ->  Sort
-         Sort Key: pagg_tab_ml.a, (sum(pagg_tab_ml.b)), (count(*))
-         ->  Parallel Append
-               ->  HashAggregate
-                     Group Key: pagg_tab_ml.a, pagg_tab_ml.b, pagg_tab_ml.c
-                     Filter: (avg(pagg_tab_ml.b) > '7'::numeric)
-                     ->  Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
-               ->  HashAggregate
-                     Group Key: pagg_tab_ml_3.a, pagg_tab_ml_3.b, pagg_tab_ml_3.c
-                     Filter: (avg(pagg_tab_ml_3.b) > '7'::numeric)
-                     ->  Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_3
-               ->  HashAggregate
-                     Group Key: pagg_tab_ml_1.a, pagg_tab_ml_1.b, pagg_tab_ml_1.c
-                     Filter: (avg(pagg_tab_ml_1.b) > '7'::numeric)
-                     ->  Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_1
-               ->  HashAggregate
-                     Group Key: pagg_tab_ml_4.a, pagg_tab_ml_4.b, pagg_tab_ml_4.c
-                     Filter: (avg(pagg_tab_ml_4.b) > '7'::numeric)
-                     ->  Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_4
-               ->  HashAggregate
-                     Group Key: pagg_tab_ml_2.a, pagg_tab_ml_2.b, pagg_tab_ml_2.c
-                     Filter: (avg(pagg_tab_ml_2.b) > '7'::numeric)
-                     ->  Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_2
-(25 rows)
+   ->  Result
+         ->  Sort
+               Sort Key: pagg_tab_ml.a, (sum(pagg_tab_ml.b)), (count(*))
+               ->  Parallel Append
+                     ->  HashAggregate
+                           Group Key: pagg_tab_ml.a, pagg_tab_ml.b, pagg_tab_ml.c
+                           Filter: (avg(pagg_tab_ml.b) > '7'::numeric)
+                           ->  Seq Scan on pagg_tab_ml_p1 pagg_tab_ml
+                     ->  HashAggregate
+                           Group Key: pagg_tab_ml_3.a, pagg_tab_ml_3.b, pagg_tab_ml_3.c
+                           Filter: (avg(pagg_tab_ml_3.b) > '7'::numeric)
+                           ->  Seq Scan on pagg_tab_ml_p3_s1 pagg_tab_ml_3
+                     ->  HashAggregate
+                           Group Key: pagg_tab_ml_1.a, pagg_tab_ml_1.b, pagg_tab_ml_1.c
+                           Filter: (avg(pagg_tab_ml_1.b) > '7'::numeric)
+                           ->  Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_1
+                     ->  HashAggregate
+                           Group Key: pagg_tab_ml_4.a, pagg_tab_ml_4.b, pagg_tab_ml_4.c
+                           Filter: (avg(pagg_tab_ml_4.b) > '7'::numeric)
+                           ->  Seq Scan on pagg_tab_ml_p3_s2 pagg_tab_ml_4
+                     ->  HashAggregate
+                           Group Key: pagg_tab_ml_2.a, pagg_tab_ml_2.b, pagg_tab_ml_2.c
+                           Filter: (avg(pagg_tab_ml_2.b) > '7'::numeric)
+                           ->  Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_2
+(26 rows)
 
 SELECT a, sum(b), count(*) FROM pagg_tab_ml GROUP BY a, b, c HAVING avg(b) > 7 ORDER BY 1, 2, 3;
  a  | sum  | count 
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 6560fe2416f..c0edb08cbad 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -111,25 +111,26 @@ SELECT COUNT(*) FROM prt1 t1
 -- left outer join, with whole-row reference; partitionwise join does not apply
 EXPLAIN (COSTS OFF)
 SELECT t1, t2 FROM prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b;
-                    QUERY PLAN                    
---------------------------------------------------
- Sort
-   Sort Key: t1.a, t2.b
-   ->  Hash Right Join
-         Hash Cond: (t2.b = t1.a)
-         ->  Append
-               ->  Seq Scan on prt2_p1 t2_1
-               ->  Seq Scan on prt2_p2 t2_2
-               ->  Seq Scan on prt2_p3 t2_3
-         ->  Hash
+                       QUERY PLAN                       
+--------------------------------------------------------
+ Result
+   ->  Sort
+         Sort Key: t1.a, t2.b
+         ->  Hash Right Join
+               Hash Cond: (t2.b = t1.a)
                ->  Append
-                     ->  Seq Scan on prt1_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_p3 t1_3
-                           Filter: (b = 0)
-(16 rows)
+                     ->  Seq Scan on prt2_p1 t2_1
+                     ->  Seq Scan on prt2_p2 t2_2
+                     ->  Seq Scan on prt2_p3 t2_3
+               ->  Hash
+                     ->  Append
+                           ->  Seq Scan on prt1_p1 t1_1
+                                 Filter: (b = 0)
+                           ->  Seq Scan on prt1_p2 t1_2
+                                 Filter: (b = 0)
+                           ->  Seq Scan on prt1_p3 t1_3
+                                 Filter: (b = 0)
+(17 rows)
 
 SELECT t1, t2 FROM prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b;
       t1      |      t2      
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 9a4c48c0556..249825c4f33 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3316,18 +3316,19 @@ create index on ma_test (b);
 analyze ma_test;
 prepare mt_q1 (int) as select a from ma_test where a >= $1 and a % 10 = 5 order by b;
 explain (analyze, costs off, summary off, timing off) execute mt_q1(15);
-                                       QUERY PLAN                                        
------------------------------------------------------------------------------------------
- Merge Append (actual rows=2 loops=1)
-   Sort Key: ma_test.b
-   Subplans Removed: 1
-   ->  Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1 loops=1)
-         Filter: ((a >= $1) AND ((a % 10) = 5))
-         Rows Removed by Filter: 9
-   ->  Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1 loops=1)
-         Filter: ((a >= $1) AND ((a % 10) = 5))
-         Rows Removed by Filter: 9
-(9 rows)
+                                          QUERY PLAN                                           
+-----------------------------------------------------------------------------------------------
+ Result (actual rows=2 loops=1)
+   ->  Merge Append (actual rows=2 loops=1)
+         Sort Key: ma_test.b
+         Subplans Removed: 1
+         ->  Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1 loops=1)
+               Filter: ((a >= $1) AND ((a % 10) = 5))
+               Rows Removed by Filter: 9
+         ->  Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1 loops=1)
+               Filter: ((a >= $1) AND ((a % 10) = 5))
+               Rows Removed by Filter: 9
+(10 rows)
 
 execute mt_q1(15);
  a  
@@ -3337,15 +3338,16 @@ execute mt_q1(15);
 (2 rows)
 
 explain (analyze, costs off, summary off, timing off) execute mt_q1(25);
-                                       QUERY PLAN                                        
------------------------------------------------------------------------------------------
- Merge Append (actual rows=1 loops=1)
-   Sort Key: ma_test.b
-   Subplans Removed: 2
-   ->  Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1 loops=1)
-         Filter: ((a >= $1) AND ((a % 10) = 5))
-         Rows Removed by Filter: 9
-(6 rows)
+                                          QUERY PLAN                                           
+-----------------------------------------------------------------------------------------------
+ Result (actual rows=1 loops=1)
+   ->  Merge Append (actual rows=1 loops=1)
+         Sort Key: ma_test.b
+         Subplans Removed: 2
+         ->  Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1 loops=1)
+               Filter: ((a >= $1) AND ((a % 10) = 5))
+               Rows Removed by Filter: 9
+(7 rows)
 
 execute mt_q1(25);
  a  
@@ -3355,12 +3357,13 @@ execute mt_q1(25);
 
 -- Ensure MergeAppend behaves correctly when no subplans match
 explain (analyze, costs off, summary off, timing off) execute mt_q1(35);
-              QUERY PLAN              
---------------------------------------
- Merge Append (actual rows=0 loops=1)
-   Sort Key: ma_test.b
-   Subplans Removed: 3
-(3 rows)
+                 QUERY PLAN                 
+--------------------------------------------
+ Result (actual rows=0 loops=1)
+   ->  Merge Append (actual rows=0 loops=1)
+         Sort Key: ma_test.b
+         Subplans Removed: 3
+(4 rows)
 
 execute mt_q1(35);
  a 
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
index 6988128aa4c..fe82211f974 100644
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -4437,18 +4437,20 @@ INSERT INTO rls_tbl SELECT 10, 20, 30;
 EXPLAIN (VERBOSE, COSTS OFF)
 INSERT INTO rls_tbl
   SELECT * FROM (SELECT b, c FROM rls_tbl ORDER BY a) ss;
-                             QUERY PLAN                             
---------------------------------------------------------------------
+                                QUERY PLAN                                
+--------------------------------------------------------------------------
  Insert on regress_rls_schema.rls_tbl
    ->  Subquery Scan on ss
          Output: ss.b, ss.c, NULL::integer
-         ->  Sort
-               Output: rls_tbl_1.b, rls_tbl_1.c, rls_tbl_1.a
-               Sort Key: rls_tbl_1.a
-               ->  Seq Scan on regress_rls_schema.rls_tbl rls_tbl_1
+         ->  Result
+               Output: rls_tbl_1.b, rls_tbl_1.c
+               ->  Sort
                      Output: rls_tbl_1.b, rls_tbl_1.c, rls_tbl_1.a
-                     Filter: (rls_tbl_1.* >= '(1,1,1)'::record)
-(9 rows)
+                     Sort Key: rls_tbl_1.a
+                     ->  Seq Scan on regress_rls_schema.rls_tbl rls_tbl_1
+                           Output: rls_tbl_1.b, rls_tbl_1.c, rls_tbl_1.a
+                           Filter: (rls_tbl_1.* >= '(1,1,1)'::record)
+(11 rows)
 
 INSERT INTO rls_tbl
   SELECT * FROM (SELECT b, c FROM rls_tbl ORDER BY a) ss;
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index d88353d496c..da6cdf00fed 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -699,19 +699,20 @@ drop function sp_simple_func(integer);
 -- test handling of SRFs in targetlist (bug in 10.0)
 explain (costs off)
    select count(*), generate_series(1,2) from tenk1 group by twenty;
-                        QUERY PLAN                        
-----------------------------------------------------------
- ProjectSet
-   ->  Finalize GroupAggregate
-         Group Key: twenty
-         ->  Gather Merge
-               Workers Planned: 4
-               ->  Partial GroupAggregate
-                     Group Key: twenty
-                     ->  Sort
-                           Sort Key: twenty
-                           ->  Parallel Seq Scan on tenk1
-(10 rows)
+                           QUERY PLAN                           
+----------------------------------------------------------------
+ Result
+   ->  ProjectSet
+         ->  Finalize GroupAggregate
+               Group Key: twenty
+               ->  Gather Merge
+                     Workers Planned: 4
+                     ->  Partial GroupAggregate
+                           Group Key: twenty
+                           ->  Sort
+                                 Sort Key: twenty
+                                 ->  Parallel Seq Scan on tenk1
+(11 rows)
 
 select count(*), generate_series(1,2) from tenk1 group by twenty;
  count | generate_series 
@@ -1129,25 +1130,27 @@ ORDER BY 1, 2, 3;
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT generate_series(1, two), array(select generate_series(1, two))
   FROM tenk1 ORDER BY tenthous;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
- ProjectSet
-   Output: generate_series(1, tenk1.two), (SubPlan 1), tenk1.tenthous
-   ->  Gather Merge
-         Output: tenk1.two, tenk1.tenthous
-         Workers Planned: 4
-         ->  Result
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Result
+   Output: (generate_series(1, tenk1.two)), ((SubPlan 1))
+   ->  ProjectSet
+         Output: generate_series(1, tenk1.two), (SubPlan 1), tenk1.tenthous
+         ->  Gather Merge
                Output: tenk1.two, tenk1.tenthous
-               ->  Sort
-                     Output: tenk1.tenthous, tenk1.two
-                     Sort Key: tenk1.tenthous
-                     ->  Parallel Seq Scan on public.tenk1
+               Workers Planned: 4
+               ->  Result
+                     Output: tenk1.two, tenk1.tenthous
+                     ->  Sort
                            Output: tenk1.tenthous, tenk1.two
-   SubPlan 1
-     ->  ProjectSet
-           Output: generate_series(1, tenk1.two)
-           ->  Result
-(16 rows)
+                           Sort Key: tenk1.tenthous
+                           ->  Parallel Seq Scan on public.tenk1
+                                 Output: tenk1.tenthous, tenk1.two
+         SubPlan 1
+           ->  ProjectSet
+                 Output: generate_series(1, tenk1.two)
+                 ->  Result
+(18 rows)
 
 -- must disallow pushing sort below gather when pathkey contains an SRF
 EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out
index d47b5f6ec57..98195e8c8c1 100644
--- a/src/test/regress/expected/tsrf.out
+++ b/src/test/regress/expected/tsrf.out
@@ -666,17 +666,19 @@ select generate_series(1,3) as x, generate_series(1,3) + 1 as xp1;
 
 explain (verbose, costs off)
 select generate_series(1,3)+1 order by generate_series(1,3);
-                               QUERY PLAN                               
-------------------------------------------------------------------------
- Sort
-   Output: (((generate_series(1, 3)) + 1)), (generate_series(1, 3))
-   Sort Key: (generate_series(1, 3))
-   ->  Result
-         Output: ((generate_series(1, 3)) + 1), (generate_series(1, 3))
-         ->  ProjectSet
-               Output: generate_series(1, 3)
-               ->  Result
-(8 rows)
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Result
+   Output: (((generate_series(1, 3)) + 1))
+   ->  Sort
+         Output: (((generate_series(1, 3)) + 1)), (generate_series(1, 3))
+         Sort Key: (generate_series(1, 3))
+         ->  Result
+               Output: ((generate_series(1, 3)) + 1), (generate_series(1, 3))
+               ->  ProjectSet
+                     Output: generate_series(1, 3)
+                     ->  Result
+(10 rows)
 
 select generate_series(1,3)+1 order by generate_series(1,3);
  ?column? 
-- 
2.39.2

Xiaoran Wang

fanfuxiaoran@gmail.com

about 2 years ago

In reply to: Heikki Linnakangas (#1)

Re: Avoid computing ORDER BY junk columns unnecessarily

Hi Heikii,
I haven't dug into your patch yet, but for this problem, I have another
idea.
-------
explain verbose
select foo from mytable order by sha256(bar::bytea);

QUERY PLAN

The index is used to satisfy the ORDER BY, but the expensive ORDER BY
expression is still computed for every row, just to be thrown away by
the junk filter.
------
How about adding the orderby column value in 'xs_heaptid' with the
'xs_heaptid'
together? So that we can use that value directly instead of computing it
when using
an index scan to fetch the ordered data.
Another problem I am concerned about is that if we exclude junk
columns in the sort plan, we may change its behavior. I'm not sure if it
can lead to some
other issues.

Heikki Linnakangas <hlinnaka@iki.fi> 于2023年12月22日周五 02:39写道：

Show quoted text

Problem
-------

We are using junk columns for (at least) two slightly different purposes:

1. For passing row IDs and other such data from lower plan nodes to
LockRows / ModifyTable.

2. To represent ORDER BY and GROUP BY columns that don't appear in the
SELECT list. For example, in a query like:

SELECT foo FROM mytable ORDER BY bar;

The parser adds 'bar' to the target list as a junk column. You can see
that with EXPLAIN VERBOSE:

explain (verbose, costs off)
select foo from mytable order by bar;

QUERY PLAN
----------------------------------
Sort
Output: foo, bar
Sort Key: mytable.bar
-> Seq Scan on public.mytable
Output: foo, bar
(5 rows)

The 'bar' column get filtered away in the executor, by the so-called
junk filter. That's fine for simple cases like the above, but in some
cases, that causes the ORDER BY value to be computed unnecessarily. For
example:

create table mytable (foo text, bar text);
insert into mytable select g, g from generate_series(1, 10000) g;
create index on mytable (sha256(bar::bytea));
explain verbose
select foo from mytable order by sha256(bar::bytea);

QUERY PLAN

------------------------------------------------------------------------------------------------
Index Scan using mytable_sha256_idx on public.mytable
(cost=0.29..737.28 rows=10000 width=64)
Output: foo, sha256((bar)::bytea)
(2 rows)

The index is used to satisfy the ORDER BY, but the expensive ORDER BY
expression is still computed for every row, just to be thrown away by
the junk filter.

This came up with pgvector, as the vector distance functions are pretty
expensive. All vector operations are expensive, so one extra distance
function call per row doesn't necessarily make that much difference, but
it sure looks silly. See
https://github.com/pgvector/pgvector/issues/359#issuecomment-1840786021
(thanks Matthias for the investigation!).

Solution
--------

The obvious solution is that the planner should not include those junk
columns in the plan. But how exactly to implement that is a different
matter.

I came up with the attached patch set, which adds a projection to all
the paths at the end of planning in grouping_planner(). The projection
filters out the unnecessary junk columns. With that, the plan for the
above example:

postgres=# explain verbose select foo from mytable order by
sha256(bar::bytea);
QUERY PLAN

-----------------------------------------------------------------------------------------------
Index Scan using mytable_sha256_idx on public.mytable
(cost=0.29..662.24 rows=10000 width=4)
Output: foo
(2 rows)

Problems with the solution
--------------------------

So this seems to work, but I have a few doubts:

1. Because Sort cannot project, this adds an extra Result node on top of
Sort nodes when the the ORDER BY is implemented by sorting:

postgres=# explain verbose select foo from mytable order by bar;
QUERY PLAN

--------------------------------------------------------------------------------
Result (cost=818.39..843.39 rows=10000 width=4)
Output: foo
-> Sort (cost=818.39..843.39 rows=10000 width=8)
Output: foo, bar
Sort Key: mytable.bar
-> Seq Scan on public.mytable (cost=0.00..154.00 rows=10000
width=8)
Output: foo, bar
(7 rows)

From a performance point of view, I think that's not as bad as it
sounds. Remember that without this patch, the executor needs to execute
the junk filter to filter out the extra column instead. It's not clear
that an extra Result is worse than that, although I haven't tried
benchmarking it though.

This makes plans for queries like above more verbose though. Perhaps we
should teach Sort (and MergeAppend) to do projection, just to avoid that?

Another solution would be to continue relying on the junk filter, if
adding the projection in the planner leads to an extra Result node.
That's a bit ugly, because then the final target list of a (sub)query
depends on the Path that's chosen.

2. Instead of tacking on the projection to the paths at the end, I first
tried modifying the code earlier in grouping_planner() that computes the
target lists for the different plan stages. That still feels like a
cleaner approach to me, although I don't think there's any difference in
the generated plans in practice. However I ran into some problems with
that approach and gave up.

I basically tried to remove the junk columns from 'final_target', and
have create_ordered_paths() create paths with the filtered target list
directly. And if there is no ORDER BY, leave out the junk columns from
'grouping_target' too, and have create_grouping_paths() generate the
final target list directly. However, create_grouping_paths() assumes
that the grouping columns are included in 'grouping_target'. And
similarly in create_ordered_paths(), some partial planning stages assume
that the ordering columns are included in 'final_target'. Those
assumptions could probably be fixed, but I ran out of steam trying to do
that.

3. I also considered if we should stop using junk columns to represent
ORDER BY / GROUP BY columns like this in the first place. Perhaps we
should have a separate list for those and not stash them in the target
list. But that seems like a much bigger change.

4. It would be nice to get rid of the junk filter in the executor
altogether. With this patch, the junk columns used for RowLocks and
ModifyTable are still included in the final target list, and are still
filtered out by the junk filter. But we could add similar projections
between the RowLocks and ModifyTable stages, to eliminate all the junk
columns at the top of the plan. ExecFilterJunk() isn't a lot of code,
but it would feel cleaner to me. I didn't try to implement that.

5. I'm not sure the categorization of junk columns that I implemented
here is the best one. It might make sense to have more categories, and
distinguish row-id columns from others for example. And ORDER BY columns
from GROUP BY columns.

Patches
-------

So the attached patches implement that idea, with the above-mentioned
problems. I think this approach is fine as it is, despite those
problems, but my planner-fu is a rusty so I really need review and a
second opinion on this.

v1-0001-Add-test-for-Min-Max-optimization-with-kNN-index-.patch
v1-0002-Show-how-ORDER-BY-expression-is-computed-unnecess.patch

These patches just add to existing tests to demonstrate the problem.

v1-0003-Turn-resjunk-into-an-enum.patch

Replace 'resjunk' boolean with an enum, so that we can distinguish
between different junk columns. The next patch uses that information to
identify junk columns that can be filtered out. It's is a separate patch
for easier review.

v1-0004-Omit-columns-from-final-tlist-that-were-only-need.patch

The main patch in this series.

v1-0005-Fix-regression-tests-caused-by-additional-Result-.patch

Regression test output changes, for all the plans with Sort that now
have Sort + Result. See "Problem with the solution" #1.

--
Heikki Linnakangas
Neon (https://neon.tech)

Heikki Linnakangas

hlinnaka@iki.fi

about 2 years ago

In reply to: Xiaoran Wang (#2)

Re: Avoid computing ORDER BY junk columns unnecessarily

On 22/12/2023 11:05, Xiaoran Wang wrote:

I haven't dug into your patch yet, but for this problem, I have another
idea.
-------
explain verbose
select foo from mytable order by sha256(bar::bytea);

QUERY PLAN

------------------------------------------------------------------------------------------------
Index Scan using mytable_sha256_idx on public.mytable
(cost=0.29..737.28 rows=10000 width=64)
Output: foo, sha256((bar)::bytea)
(2 rows)

The index is used to satisfy the ORDER BY, but the expensive ORDER BY
expression is still computed for every row, just to be thrown away by
the junk filter.
------
How about adding the orderby column value in 'xs_heaptid' with the
'xs_heaptid'
together? So that we can use that value directly instead of computing it
when using
an index scan to fetch the ordered data.

Hmm, so return the computed column from the index instead of recomputing
it? Yeah, that makes sense too and would help in this example. It won't
help in all cases though, the index might not store the original value
in the first place.

--
Heikki Linnakangas
Neon (https://neon.tech)

Tom Lane

tgl@sss.pgh.pa.us

about 2 years ago

In reply to: Heikki Linnakangas (#3)

Re: Avoid computing ORDER BY junk columns unnecessarily

Heikki Linnakangas <hlinnaka@iki.fi> writes:

Hmm, so return the computed column from the index instead of recomputing
it? Yeah, that makes sense too and would help in this example.

Yeah, that's been on the to-do list for ages. The main problems are
(1) we need the planner to not spend too much effort on looking for
subexpression matches, and (2) amcanreturn ability isn't implemented
by the executor in plain indexscans. There's another thread right now
discussing fixing (2), after which we could perhaps work on this.

It won't
help in all cases though, the index might not store the original value
in the first place.

I'm a little skeptical that an index could produce an accurate ORDER BY
result if it doesn't store the values-to-be-sorted exactly. Any loss
of information would compromise its ability to sort nearly-identical
values correctly. A more credible argument is that the index might
expose amcanorder ability but not amcanreturn; but what I'm saying is
that that's probably an AM implementation gap that ought to be fixed.

How much of your patchset still makes sense if we assume that we
can always extract the ORDER BY column values from the index?

regards, tom lane

Heikki Linnakangas

hlinnaka@iki.fi

about 2 years ago

In reply to: Tom Lane (#4)

Re: Avoid computing ORDER BY junk columns unnecessarily

On 22/12/2023 17:24, Tom Lane wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

It won't
help in all cases though, the index might not store the original value
in the first place.

I'm a little skeptical that an index could produce an accurate ORDER BY
result if it doesn't store the values-to-be-sorted exactly. Any loss
of information would compromise its ability to sort nearly-identical
values correctly.

In the context of pgvector, its ordering is approximate anyway. Aside
from that, there's one trick that it implements: it compares squares of
distances, avoiding a sqrt() calculation. (I wonder if we could do the
same in GiST opclasses)

A more credible argument is that the index might
expose amcanorder ability but not amcanreturn; but what I'm saying is
that that's probably an AM implementation gap that ought to be fixed.

How much of your patchset still makes sense if we assume that we
can always extract the ORDER BY column values from the index?

That would make it much less interesting. But I don't think that's a
good assumption. Especially in the kNN case, the ORDER BY value would
not be stored in the index. Most likely the index needs to calculate it
in some form, but it might take shortcuts like avoiding the sqrt().

--
Heikki Linnakangas
Neon (https://neon.tech)

Tom Lane

tgl@sss.pgh.pa.us

about 2 years ago

In reply to: Heikki Linnakangas (#5)

Re: Avoid computing ORDER BY junk columns unnecessarily

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 22/12/2023 17:24, Tom Lane wrote:

How much of your patchset still makes sense if we assume that we
can always extract the ORDER BY column values from the index?

That would make it much less interesting. But I don't think that's a
good assumption. Especially in the kNN case, the ORDER BY value would
not be stored in the index. Most likely the index needs to calculate it
in some form, but it might take shortcuts like avoiding the sqrt().

Yeah, fair point. I'll try to take a look at your patchset after
the holidays.

regards, tom lane

Richard Guo

guofenglinux@gmail.com

about 2 years ago

In reply to: Tom Lane (#6)

Re: Avoid computing ORDER BY junk columns unnecessarily

On Sat, Dec 23, 2023 at 1:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 22/12/2023 17:24, Tom Lane wrote:

How much of your patchset still makes sense if we assume that we
can always extract the ORDER BY column values from the index?

That would make it much less interesting. But I don't think that's a
good assumption. Especially in the kNN case, the ORDER BY value would
not be stored in the index. Most likely the index needs to calculate it
in some form, but it might take shortcuts like avoiding the sqrt().

Yeah, fair point. I'll try to take a look at your patchset after
the holidays.

Agreed.

I haven't looked into these patches, but it seems that there is an issue
with how the targetlist is handled for foreign rels. The following test
case for postgres_fdw hits the Assert in apply_tlist_labeling().

contrib_regression=# SELECT c3, c4 FROM ft1 ORDER BY c3, c1 LIMIT 1;
server closed the connection unexpectedly

When we create foreign final path in add_foreign_final_paths(), we use
root->upper_targets[UPPERREL_FINAL] as the PathTarget. This PathTarget
is set in grouping_planner(), without removing the junk columns. I
think this is why the above query hits the Assert.

Thanks
Richard

Richard Guo

guofenglinux@gmail.com

about 2 years ago

In reply to: Heikki Linnakangas (#1)

Re: Avoid computing ORDER BY junk columns unnecessarily

On Fri, Dec 22, 2023 at 2:38 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

v1-0004-Omit-columns-from-final-tlist-that-were-only-need.patch

The main patch in this series.

This patch filters out the junk columns created for ORDER BY/GROUP BY,
and retains the junk columns created for RowLocks. I'm afraid this may
have a problem about out-of-order resnos. For instance,

create table mytable (foo text, bar text);

# explain select foo from mytable order by bar for update of mytable;
server closed the connection unexpectedly

This query triggers another Assert in apply_tlist_labeling:

Assert(dest_tle->resno == src_tle->resno);

At first there are three TargetEntry items: foo, bar and ctid, with
resnos being 1, 2 and 3. And then the second item 'bar' is removed,
leaving only two items: foo and ctid, with resnos being 1 and 3. So now
we have a missing resno, and finally hit the Assert.

Thanks
Richard

Tom Lane

tgl@sss.pgh.pa.us

about 2 years ago

In reply to: Tom Lane (#6)

Re: Avoid computing ORDER BY junk columns unnecessarily

I wrote:

Yeah, fair point. I'll try to take a look at your patchset after
the holidays.

I spent some time looking at this patch, and I'm not very pleased
with it. My basic complaint is that this is a band-aid that only
touches things at a surface level, whereas I think we need a much
deeper set of changes in order to have a plausible solution.
Specifically, you're only munging the final top-level targetlist
not anything for lower plan levels. That looks like it works in
simple cases, but it leaves everything on the table as soon as
there's more than one level of plan involved. I didn't stop to
trace the details but I'm pretty sure this is why you're getting the
bogus extra projections shown in the 0005 patch. Moreover, this
isn't doing anything for cost estimates, which means the planner
doesn't really understand that more-desirable plans are more
desirable, and it may not pick an available plan that would exploit
what we want to have happen here.

As an example, say we have an index on f(x) and the query requires
sorting by f(x) but not final output of f(x). If we devise a plan
that uses the index to sort and doesn't emit f(x), we need to not
charge the evaluation cost of f() for that plan --- this might
make the difference between picking that plan and not. Right now
I believe we'll charge all plans for f(), so that some other plan
might look cheaper even when f() is very expensive.

Another example: we might be using an indexscan but not relying on
its sort order, for example because its output is going into a hash
join and then we'll sort afterwards. For expensive f() it would
still be useful if the index could emit f(x) so that we don't have
to calculate f() at the sort step. Right now I don't think we can
even emit that plan shape, never mind recognize why it's cheaper.

I have only vague ideas about how to do this better. It might work
to set up multiple PathTargets for a relation that has such an
index, one for the base case where the scan only emits x and f() is
computed above, one for the case where we don't need either x or
f(x) because we're relying on the index order, and one that emits
f(x) with the expectation that a sort will happen above. Then we'd
potentially generate multiple Paths representing the very same
indexscan but with different PathTargets, and differing targets
would have to become something that add_path treats as a reason to
keep multiple Paths for the same relation. I'm a little frightened
about the possible growth in number of paths considered, but how
else would we keep track of the differing costs of these cases?

So your proposed patch is far from a final solution in this area,
and I'm not even sure that it represents a plausible incremental
step. I've got mixed feelings about the "make resjunk an enum"
idea --- it seems quite invasive and I'm not sure it's buying
enough to risk breaking non-planner code for. It might be
better to leave the targetlist representation alone and deal
with this strictly inside the planner. We could, for example,
add an array to PlannerInfo that's indexed by query-tlist resno
and indicates the reason for a particular tlist item to exist.

Also, as you mentioned, this categorization of junk columns
seems a little unprincipled. It might help to think about that
in terms similar to the "attr_needed" data we keep for Vars,
that is: how far up in the plan tree is this tlist item needed?
I think the possibilities are:

* needed for final output (ie, not resjunk)

* needed for final grouping/sorting steps (I think all
resjunk items produced by the parser are this case;
but do we need to distinguish GROUP BY from ORDER BY?)

* needed at the ModifyTable node (for rowmark columns)

* needed at the SetOp node (for flag columns added in
prepunion.c)

It looks to me like these cases are mutually exclusive, so
that we just need an enum value not a bitmask. Only
"needed for final grouping/sorting" is something we can
potentially omit from the plan.

BTW, the hack you invented JUNK_PLANNER_ONLY for, namely
that create_indexscan_plan feeds canreturn flags forward to
set_indexonlyscan_references, could be done another way:
we don't really have to overload resjunk for that. At worst,
set_indexonlyscan_references could re-look-up the IndexOptInfo
using the index OID from the plan node, and get the canreturn
flags directly from that. If we keep resjunk as a bool I'd be
inclined to leave this alone; but if we change resjunk, we
don't need the replacement design to account for this bit.

regards, tom lane

#10

Heikki Linnakangas

hlinnaka@iki.fi

about 2 years ago

In reply to: Tom Lane (#9)

Re: Avoid computing ORDER BY junk columns unnecessarily

On 29/12/2023 01:42, Tom Lane wrote:

I wrote:

Yeah, fair point. I'll try to take a look at your patchset after
the holidays.

I spent some time looking at this patch, and I'm not very pleased
with it. My basic complaint is that this is a band-aid that only
touches things at a surface level, whereas I think we need a much
deeper set of changes in order to have a plausible solution.
Specifically, you're only munging the final top-level targetlist
not anything for lower plan levels. That looks like it works in
simple cases, but it leaves everything on the table as soon as
there's more than one level of plan involved.

Yeah, that's fair.

I didn't stop to trace the details but I'm pretty sure this is why
you're getting the bogus extra projections shown in the 0005 patch.

They're not bogus. With the patches, projecting away the junk columns is
visible in the plan as an extra Result node, while currently it's done
as an implicit step in the executor. That seems fine and arguably an
even more honest representation of what's happening, although I don't
like the extra verbosity in EXPLAIN output.

Moreover, this isn't doing anything for cost estimates, which means
the planner doesn't really understand that more-desirable plans are
more desirable, and it may not pick an available plan that would
exploit what we want to have happen here.

As an example, say we have an index on f(x) and the query requires
sorting by f(x) but not final output of f(x). If we devise a plan
that uses the index to sort and doesn't emit f(x), we need to not
charge the evaluation cost of f() for that plan --- this might
make the difference between picking that plan and not. Right now
I believe we'll charge all plans for f(), so that some other plan
might look cheaper even when f() is very expensive.

Another example: we might be using an indexscan but not relying on

its sort order, for example because its output is going into a hash
join and then we'll sort afterwards. For expensive f() it would
still be useful if the index could emit f(x) so that we don't have
to calculate f() at the sort step. Right now I don't think we can
even emit that plan shape, never mind recognize why it's cheaper.

Related to this, we are not currently costing the target list evaluation
correctly for index-only scans. Here's an example:

create or replace function expensive(i int) returns int
language plpgsql as
$$
begin return i; end;
$$
immutable cost 1000000;

create table atab (i int);
insert into atab select g from generate_series(1, 10000) g;
create index on atab (i, expensive(i));

postgres=# explain verbose select expensive(i) from atab order by
expensive(i);
QUERY PLAN

----------------------------------------------------------------------------
Sort (cost=25000809.39..25000834.39 rows=10000 width=4)
Output: (expensive(i))
Sort Key: (expensive(atab.i))
-> Seq Scan on public.atab (cost=0.00..25000145.00 rows=10000 width=4)
Output: expensive(i)
(5 rows)

postgres=# set enable_seqscan=off; set enable_bitmapscan=off;
SET
SET
postgres=# explain verbose select expensive(i) from atab order by
expensive(i);
QUERY PLAN

--------------------------------------------------------------------------------------------------------------
Sort (cost=25001114.67..25001139.67 rows=10000 width=4)
Output: (expensive(i))
Sort Key: (expensive(atab.i))
-> Index Only Scan using atab_i_expensive_idx on public.atab
(cost=0.29..25000450.29 rows=10000 width=4)
Output: (expensive(i))
(5 rows)

The cost of the index only scan ought to be much lower than the seqscan,
as it can return the pre-computed expensive(i) from the index.

That could probably be fixed without any of the other changes we've been
discussing here, though.

I have only vague ideas about how to do this better. It might work
to set up multiple PathTargets for a relation that has such an
index, one for the base case where the scan only emits x and f() is
computed above, one for the case where we don't need either x or
f(x) because we're relying on the index order, and one that emits
f(x) with the expectation that a sort will happen above. Then we'd
potentially generate multiple Paths representing the very same
indexscan but with different PathTargets, and differing targets
would have to become something that add_path treats as a reason to
keep multiple Paths for the same relation. I'm a little frightened
about the possible growth in number of paths considered, but how
else would we keep track of the differing costs of these cases?

Hmm, if there are multiple functions like that in the target list, would
you need to create different paths for each combination of expressions?
That could really explode the number of paths.

Perhaps each Path could include "optional" target entries that it can
calculate more cheaply, with a separate cost for each such expression.
add_path() would treat the presence of optional target entries as a
reason to retain the path, but you wouldn't need to keep a separate path
for each PathTarget.

Another idea is to include f(x) in the PathTarget only if it's "cheap to
emit". For example, if it's an expression index on f(x). If it turns out
that f(x) is not needed higher up in the plan, it's not a big error in
the estimate because it was cheap to emit. That wouldn't work well if
the index AM could calculate f(x) more cheaply than the executor, but
the cost was still not trivial. I'm not sure if that situation can arise
currently.

* needed for final grouping/sorting steps (I think all
resjunk items produced by the parser are this case;
but do we need to distinguish GROUP BY from ORDER BY?)

It would seem useful to distinguish GROUP BY and ORDER BY. For example:

SELECT COUNT(*) FROM table GROUP BY a, b ORDER BY a;

If this is implemented as HashAgg + Sort for example, only 'a' would be
needed by the sort. Including less data in a Sort is good.

I wanted to implement that in my patch already, by removing the junk
columns needed for grouping but not sorting from sort_input_target. But
the grouping path generation has some assumptions that the grouping
output target list includes all the grouping columns. I don't remember
the exact problem that made me give up on that, but it probably could be
fixed.

--
Heikki Linnakangas
Neon (https://neon.tech)

#11

Tom Lane

tgl@sss.pgh.pa.us

about 2 years ago

In reply to: Heikki Linnakangas (#10)

Re: Avoid computing ORDER BY junk columns unnecessarily

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 29/12/2023 01:42, Tom Lane wrote:

I didn't stop to trace the details but I'm pretty sure this is why
you're getting the bogus extra projections shown in the 0005 patch.

They're not bogus. With the patches, projecting away the junk columns is
visible in the plan as an extra Result node, while currently it's done
as an implicit step in the executor. That seems fine and arguably an
even more honest representation of what's happening, although I don't
like the extra verbosity in EXPLAIN output.

I didn't bring it up in my previous message, but I'm not really on
board with trying to get rid of the executor junkfilter. It seems
to me that that code is probably faster than a general-purpose
projection, and surely faster than an extra Result node, so I think
that is not a goal that would improve matters.

However, what I *was* trying to say is that I think those projections
occur because the lower-level plan node is still outputting the
columns you want to suppress, and then the planner realizes that it
needs a projection to create the shortened tlist. But that's not
saving any work, because we still computed the expensive function :-(.
We need a more thorough treatment of the problem to ensure that the
lower-level plan nodes don't emit these columns either.

Related to this, we are not currently costing the target list evaluation
correctly for index-only scans.

Right, and we're dumb in other ways too: if the index can return
f(x) but not x, we fail to realize that we can use it for an IOS
in the first place, because there's an assumption that we'd better
be able to fetch x. Currently I think the best way around that
might be via the other discussion that's going on about unifying
regular indexscans and index-only scans [1]/messages/by-id/20230609000600.syqy447e6metnvyj@awork3.anarazel.de. If we do that then
we could postpone the decision about whether we actually need x
itself, and perhaps that would simplify getting this right.

I'm kind of inclined to think that it'd be best to push the
other discussion forward first, and come back to this area
when it's done, because that will be touching a lot of the
same code as well as (maybe) simplifying the planner's problem.

regards, tom lane

[1]: /messages/by-id/20230609000600.syqy447e6metnvyj@awork3.anarazel.de

#12

Peter Smith

smithpb2250@gmail.com

almost 2 years ago

In reply to: Heikki Linnakangas (#1)

Re: Avoid computing ORDER BY junk columns unnecessarily

2024-01 Commitfest.

Hi, This patch has a CF status of "Needs Review", but it seems like
there were some CFbot test failures last time it was run [1]https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/4717. Please
have a look and post an updated version if necessary.

======
[1]: https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/4717