Properly pathify the union planner

Started by David Rowley about 2 years ago, 27 messages
#1 David Rowley
dgrowleyml@gmail.com
1 attachment(s)

The upper planner was pathified many years ago now. That was a large
chunk of work, and because of that, the union planner was not properly
pathified in that effort. A small note was left in
recurse_set_operations() mentioning this as future work.

You can see this lack of pathification in make_union_unique() and
choose_hashed_setop(). There are heuristics in there to decide the
method to use instead of creating paths and letting add_path() decide
what's faster.

I've been working on improving this for the past few weeks and I'm not
quite as far along as I'd hoped, but what I have is perhaps worthy of
sharing. For now, I've only improved UNIONs.

A UNION plan can now look like:

# explain (costs off) select * from a union select * from a;
                     QUERY PLAN
---------------------------------------------------
 Unique
   ->  Merge Append
         Sort Key: a.a
         ->  Index Only Scan using a_pkey on a
         ->  Index Only Scan using a_pkey on a a_1

Previously we'd only have considered Append -> Hash Aggregate or
Append -> Sort -> Unique.

To make this work, the query_planner() needs to know about setops, so
I've passed those down via the standard_qp_extra struct so that we can
choose pathkeys for the setops.

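Roughly, the shape of that change in the attached patch: standard_qp_extra
grows a field for the parent SetOperationStmt, and standard_qp_callback()
uses it to set query_pathkeys for a sortable UNION. A trimmed-down sketch
(the patch also assigns ressortgroupref to the non-junk targets first;
tlist here is root->processed_tlist):

    typedef struct
    {
        List       *activeWindows;      /* active windows, if any */
        grouping_sets_data *gset_data;   /* grouping sets data, if any */
        SetOperationStmt *setops;        /* set operation details, or NULL */
    } standard_qp_extra;

    /* in standard_qp_callback(), after the existing query_pathkeys choices */
    else if (qp_extra->setops != NULL &&
             qp_extra->setops->op == SETOP_UNION &&
             !qp_extra->setops->all)
    {
        List   *groupClauses = generate_setop_grouplist(qp_extra->setops, tlist);

        if (grouping_is_sortable(groupClauses))
            root->query_pathkeys =
                make_pathkeys_for_sortclauses(root, groupClauses, tlist);
        else
            root->query_pathkeys = NIL;
    }
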
One part that still needs work is the EquivalenceClass building.
Because we only build the final targetlist for the Append after
planning all the append child queries, I ended up having to populate
the EquivalenceClasses backwards, i.e. children first. add_eq_member()
determines if you're passing a child member by checking if parent !=
NULL. Since I don't have a parent EquivalenceMember to pass,
em_is_child gets set wrongly, and that causes problems because
ec_has_const can get set to true when it shouldn't. This is a problem
as it can make a PathKey redundant when it isn't. I wonder if I'll
need to change the signature of add_eq_member() and add an "is_child"
bool to force the EM to be a child em... Needs more thought...

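For the record, a minimal sketch of what that signature change might look
like (purely hypothetical, not part of the attached patch):

    static EquivalenceMember *
    add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
                  JoinDomain *jdomain, EquivalenceMember *parent,
                  bool is_child, Oid datatype);

    /* ... and inside, trust the caller instead of inferring from parent: */
    em->em_is_child = is_child;

Callers adding child members would then pass is_child = true even when they
have no parent EquivalenceMember to supply.
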
I've not worked on the creation of Incremental Sort paths yet, or done
any path plumbing work to have UNION consider Gather Merge -> Unique
on already-sorted paths. I think to make similar improvements to
EXCEPT and INTERSECT we'd need a new executor node, perhaps
nodeMergeAppendSetops.c, which could be configured to do EXCEPT or
INTERSECT. It could perhaps handle UNION too, and then we could use
that instead of Merge Append -> Unique. That might save doing some
slot copying and improve performance further. I'm not planning on
doing that for the first stage; I only intend to improve UNION, and we
already have all the executor nodes to make that work.

Anyway, I've attached my WIP patch for this.

David

Attachments:

better_union_planner_v0.patch (text/plain; charset=US-ASCII)
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 7fa502d6e2..54ddcf8285 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2599,6 +2599,44 @@ find_derived_clause_for_ec_member(EquivalenceClass *ec,
 	return NULL;
 }
 
+/*
+ * add_union_rel_equivalences
+ */
+void
+add_union_rel_equivalences(PlannerInfo *root, Relids relids,
+						   List *tlist, List *union_pathkeys)
+{
+	ListCell   *lc;
+	ListCell   *lc2 = list_head(union_pathkeys);
+
+	foreach(lc, tlist)
+	{
+		TargetEntry *tle = lfirst_node(TargetEntry, lc);
+		PathKey    *pk;
+
+		if (tle->resjunk)
+			continue;
+
+		if (lc2 == NULL)
+			elog(ERROR, "wrong number of union pathkeys");
+
+		pk = lfirst_node(PathKey, lc2);
+
+		/*
+		 * XXX this needs fixed.  The children are added before the parents,
+		 * so we cannot identify the parent.
+		 */
+		add_eq_member(pk->pk_eclass,
+					  tle->expr,
+					  relids,
+					  NULL,
+					  NULL,
+					  exprType((Node *) tle->expr));
+		pk->pk_eclass->ec_has_const = false;	/* XXX hack hack */
+
+		lc2 = lnext(union_pathkeys, lc2);
+	}
+}
 
 /*
  * add_child_rel_equivalences
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index fdb60aaa8d..00f2071cf0 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -786,6 +786,59 @@ build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel,
 	return retval;
 }
 
+/*
+ * build_union_child_pathkeys
+ */
+List *
+build_union_child_pathkeys(PlannerInfo *root, SetOperationStmt *op,
+						   RelOptInfo *rel, List *tlist)
+{
+	ListCell   *lc;
+	ListCell   *sgccell = list_head(op->groupClauses);
+	List	   *retval = NIL;
+
+	foreach(lc, tlist)
+	{
+		TargetEntry *tle = lfirst_node(TargetEntry, lc);
+		SortGroupClause *sgc;
+
+		Oid			opfamily;
+		Oid			opcintype;
+		int16		strategy;
+		PathKey    *cpathkey;
+
+		if (tle->resjunk)
+			continue;
+
+		if (sgccell == NULL)
+			elog(ERROR, "wrong number of group clauses");	/* XXX write better
+															 * error */
+
+		sgc = lfirst_node(SortGroupClause, sgccell);
+
+		/* Find the operator in pg_amop --- failure shouldn't happen */
+		if (!get_ordering_op_properties(sgc->sortop,
+										&opfamily, &opcintype, &strategy))
+			elog(ERROR, "operator %u is not a valid ordering operator",
+				 sgc->eqop);
+
+		cpathkey = make_pathkey_from_sortinfo(root,
+											  tle->expr,
+											  opfamily,
+											  opcintype,
+											  exprCollation((Node *) tle->expr),
+											  false,
+											  sgc->nulls_first,
+											  0,
+											  rel->relids,
+											  true);
+		retval = lappend(retval, cpathkey);
+		sgccell = lnext(op->groupClauses, sgccell);
+	}
+
+	return retval;
+}
+
 /*
  * build_expression_pathkey
  *	  Build a pathkeys list that describes an ordering by a single expression
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a8cea5efe1..c5bb4e6c60 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -126,6 +126,8 @@ typedef struct
 {
 	List	   *activeWindows;	/* active windows, if any */
 	grouping_sets_data *gset_data;	/* grouping sets data, if any */
+	SetOperationStmt *setops;	/* set operation stmt details or NULL if not a
+								 * set operation */
 } standard_qp_extra;
 
 /* Local functions */
@@ -1504,6 +1506,13 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		/* Set up data needed by standard_qp_callback */
 		qp_extra.activeWindows = activeWindows;
 		qp_extra.gset_data = gset_data;
+		if (root->parent_root != NULL &&
+			root->parent_root->parse->setOperations != NULL &&
+			IsA(root->parent_root->parse->setOperations, SetOperationStmt))
+			qp_extra.setops =
+				(SetOperationStmt *) root->parent_root->parse->setOperations;
+		else
+			qp_extra.setops = NULL;
 
 		/*
 		 * Generate the best unsorted and presorted paths for the scan/join
@@ -3551,6 +3560,32 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 		root->query_pathkeys = root->distinct_pathkeys;
 	else if (root->sort_pathkeys)
 		root->query_pathkeys = root->sort_pathkeys;
+	else if (qp_extra->setops != NULL &&
+			 qp_extra->setops->op == SETOP_UNION &&
+			 !qp_extra->setops->all)
+	{
+		List	   *groupClauses;
+		ListCell   *lc;
+
+		foreach(lc, root->processed_tlist)
+		{
+			TargetEntry *tle = lfirst_node(TargetEntry, lc);
+
+			if (tle->resjunk)
+				continue;
+
+			tle->ressortgroupref = tle->resno;
+		}
+
+		groupClauses = generate_setop_grouplist(qp_extra->setops, tlist);
+
+		if (grouping_is_sortable(groupClauses))
+			root->query_pathkeys = make_pathkeys_for_sortclauses(root,
+																 groupClauses,
+																 tlist);
+		else
+			root->query_pathkeys = NIL;
+	}
 	else
 		root->query_pathkeys = NIL;
 }
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 0c68ec011b..ac89c70c50 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -47,10 +47,12 @@
 
 
 static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
+										  SetOperationStmt *top_union,
 										  List *colTypes, List *colCollations,
 										  bool junkOK,
 										  int flag, List *refnames_tlist,
 										  List **pTargetList,
+										  List **first_child_pathkeys,
 										  double *pNumGroups);
 static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
 										   PlannerInfo *root,
@@ -65,9 +67,8 @@ static RelOptInfo *generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *ro
 static List *plan_union_children(PlannerInfo *root,
 								 SetOperationStmt *top_union,
 								 List *refnames_tlist,
-								 List **tlist_list);
-static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-							   PlannerInfo *root);
+								 List **tlist_list,
+								 List **first_child_pathkeys);
 static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 								Path *input_path,
@@ -84,7 +85,6 @@ static List *generate_append_tlist(List *colTypes, List *colCollations,
 								   bool flag,
 								   List *input_tlists,
 								   List *refnames_tlist);
-static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
 
 
 /*
@@ -166,11 +166,12 @@ plan_set_operations(PlannerInfo *root)
 		 * output from the top-level node, plus possibly resjunk working
 		 * columns (we can rely on upper-level nodes to deal with that).
 		 */
-		setop_rel = recurse_set_operations((Node *) topop, root,
+		setop_rel = recurse_set_operations((Node *) topop, root, topop,
 										   topop->colTypes, topop->colCollations,
 										   true, -1,
 										   leftmostQuery->targetList,
 										   &top_tlist,
+										   NULL,
 										   NULL);
 	}
 
@@ -206,10 +207,12 @@ plan_set_operations(PlannerInfo *root)
  */
 static RelOptInfo *
 recurse_set_operations(Node *setOp, PlannerInfo *root,
+					   SetOperationStmt *topop,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
 					   int flag, List *refnames_tlist,
 					   List **pTargetList,
+					   List **first_child_pathkeys,
 					   double *pNumGroups)
 {
 	RelOptInfo *rel = NULL;		/* keep compiler quiet */
@@ -224,10 +227,10 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		Query	   *subquery = rte->subquery;
 		PlannerInfo *subroot;
 		RelOptInfo *final_rel;
-		Path	   *subpath;
-		Path	   *path;
 		List	   *tlist;
+		List	   *child_pathkeys;
 		bool		trivial_tlist;
+		ListCell   *lc;
 
 		Assert(subquery != NULL);
 
@@ -263,6 +266,26 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		/* Return the fully-fledged tlist to caller, too */
 		*pTargetList = tlist;
 
+		if (!topop->all && first_child_pathkeys != NULL &&
+			grouping_is_sortable(topop->groupClauses))
+		{
+			if (*first_child_pathkeys == NIL)
+			{
+				child_pathkeys = build_union_child_pathkeys(root,
+															topop,
+															rel,
+															tlist);
+				*first_child_pathkeys = child_pathkeys;
+			}
+			else
+			{
+				add_union_rel_equivalences(root,
+										   rel->relids,
+										   tlist,
+										   *first_child_pathkeys);
+			}
+		}
+
 		/*
 		 * Mark rel with estimated output rows, width, etc.  Note that we have
 		 * to do this before generating outer-query paths, else
@@ -278,44 +301,61 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		rel->consider_parallel = final_rel->consider_parallel;
 
 		/*
-		 * For the moment, we consider only a single Path for the subquery.
-		 * This should change soon (make it look more like
-		 * set_subquery_pathlist).
-		 */
-		subpath = get_cheapest_fractional_path(final_rel,
-											   root->tuple_fraction);
-
-		/*
-		 * Stick a SubqueryScanPath atop that.
-		 *
-		 * We don't bother to determine the subquery's output ordering since
-		 * it won't be reflected in the set-op result anyhow; so just label
-		 * the SubqueryScanPath with nil pathkeys.  (XXX that should change
-		 * soon too, likely.)
+		 * For each Path that subquery_planner produced, make a
+		 * SubqueryScanPath in the outer query.
 		 */
-		path = (Path *) create_subqueryscan_path(root, rel, subpath,
-												 trivial_tlist,
-												 NIL, NULL);
-
-		add_path(rel, path);
+		foreach(lc, final_rel->pathlist)
+		{
+			Path	   *subpath = (Path *) lfirst(lc);
+			List	   *pathkeys;
+
+			/* Convert subpath's pathkeys to outer representation */
+			pathkeys = convert_subquery_pathkeys(root,
+												 rel,
+												 subpath->pathkeys,
+												 make_tlist_from_pathtarget(subpath->pathtarget));
+
+			/* Generate outer path using this subpath */
+			add_path(rel,
+					 (Path *) create_subqueryscan_path(root,
+													   rel,
+													   subpath,
+													   trivial_tlist,
+													   pathkeys,
+													   NULL));
+		}
 
-		/*
-		 * If we have a partial path for the child relation, we can use that
-		 * to build a partial path for this relation.  But there's no point in
-		 * considering any path but the cheapest.
-		 */
-		if (rel->consider_parallel && bms_is_empty(rel->lateral_relids) &&
-			final_rel->partial_pathlist != NIL)
+		/* If outer rel allows parallelism, do same for partial paths. */
+		if (rel->consider_parallel && bms_is_empty(rel->lateral_relids))
 		{
-			Path	   *partial_subpath;
-			Path	   *partial_path;
-
-			partial_subpath = linitial(final_rel->partial_pathlist);
-			partial_path = (Path *)
-				create_subqueryscan_path(root, rel, partial_subpath,
-										 trivial_tlist,
-										 NIL, NULL);
-			add_partial_path(rel, partial_path);
+			/*
+			 * If consider_parallel is false, there should be no partial
+			 * paths.
+			 */
+			Assert(final_rel->consider_parallel ||
+				   final_rel->partial_pathlist == NIL);
+
+			/* Same for partial paths. */
+			foreach(lc, final_rel->partial_pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				List	   *pathkeys;
+
+				/* Convert subpath's pathkeys to outer representation */
+				pathkeys = convert_subquery_pathkeys(root,
+													 rel,
+													 subpath->pathkeys,
+													 make_tlist_from_pathtarget(subpath->pathtarget));
+
+				/* Generate outer path using this subpath */
+				add_partial_path(rel, (Path *)
+								 create_subqueryscan_path(root,
+														  rel,
+														  subpath,
+														  trivial_tlist,
+														  pathkeys,
+														  NULL));
+			}
 		}
 
 		/*
@@ -338,11 +378,11 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			if (subquery->groupClause || subquery->groupingSets ||
 				subquery->distinctClause ||
 				subroot->hasHavingQual || subquery->hasAggs)
-				*pNumGroups = subpath->rows;
+				*pNumGroups = final_rel->rows;
 			else
 				*pNumGroups = estimate_num_groups(subroot,
 												  get_tlist_exprs(subquery->targetList, false),
-												  subpath->rows,
+												  final_rel->rows,
 												  NULL,
 												  NULL);
 		}
@@ -464,20 +504,22 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	 * Unlike a regular UNION node, process the left and right inputs
 	 * separately without any intention of combining them into one Append.
 	 */
-	lrel = recurse_set_operations(setOp->larg, root,
+	lrel = recurse_set_operations(setOp->larg, root, setOp,
 								  setOp->colTypes, setOp->colCollations,
 								  false, -1,
 								  refnames_tlist,
 								  &lpath_tlist,
+								  NULL,
 								  NULL);
 	lpath = lrel->cheapest_total_path;
 	/* The right path will want to look at the left one ... */
 	root->non_recursive_path = lpath;
-	rrel = recurse_set_operations(setOp->rarg, root,
+	rrel = recurse_set_operations(setOp->rarg, root, setOp,
 								  setOp->colTypes, setOp->colCollations,
 								  false, -1,
 								  refnames_tlist,
 								  &rpath_tlist,
+								  NULL,
 								  NULL);
 	rpath = rrel->cheapest_total_path;
 	root->non_recursive_path = NULL;
@@ -553,13 +595,18 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	double		save_fraction = root->tuple_fraction;
 	ListCell   *lc;
 	List	   *pathlist = NIL;
+	List	   *ordered_pathlist = NIL;
 	List	   *partial_pathlist = NIL;
 	bool		partial_paths_valid = true;
 	bool		consider_parallel = true;
 	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
-	Path	   *path;
+	List	   *first_child_pathkeys = NIL;
+	List	   *groupList = NIL;
+	Path	   *apath;
+	Path	   *gpath = NULL;
+	bool		try_sorted;
 
 	/*
 	 * If plain UNION, tell children to fetch all tuples.
@@ -581,7 +628,11 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+	rellist = plan_union_children(root,
+								  op,
+								  refnames_tlist,
+								  &tlist_list,
+								  &first_child_pathkeys);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -593,15 +644,47 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	tlist = generate_append_tlist(op->colTypes, op->colCollations, false,
 								  tlist_list, refnames_tlist);
 
+	/* For UNIONs (not UNION ALL), try sorting, if sorting is possible */
+	try_sorted = !op->all && grouping_is_sortable(op->groupClauses);
+
+	if (try_sorted)
+	{
+		/* add equivalence members for top-level target list */
+		add_union_rel_equivalences(root,
+								   bms_make_singleton(0),
+								   tlist,
+								   first_child_pathkeys);
+		groupList = generate_setop_grouplist(op, tlist);
+	}
+
 	*pTargetList = tlist;
 
 	/* Build path lists and relid set. */
 	foreach(lc, rellist)
 	{
 		RelOptInfo *rel = lfirst(lc);
+		Path	   *ordered_path;
 
 		pathlist = lappend(pathlist, rel->cheapest_total_path);
 
+		if (try_sorted)
+		{
+			ordered_path = get_cheapest_path_for_pathkeys(rel->pathlist,
+														  first_child_pathkeys,
+														  NULL,
+														  TOTAL_COST,
+														  true);
+
+			/*
+			 * XXX should we sort the cheapest path if we can't find a path
+			 * that's correctly ordered?
+			 */
+			if (ordered_path != NULL)
+				ordered_pathlist = lappend(ordered_pathlist, ordered_path);
+			else
+				try_sorted = false;
+		}
+
 		if (consider_parallel)
 		{
 			if (!rel->consider_parallel)
@@ -627,24 +710,15 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	/*
 	 * Append the child results together.
 	 */
-	path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
-									   NIL, NULL, 0, false, -1);
-
-	/*
-	 * For UNION ALL, we just need the Append path.  For UNION, need to add
-	 * node(s) to remove duplicates.
-	 */
-	if (!op->all)
-		path = make_union_unique(op, path, tlist, root);
-
-	add_path(result_rel, path);
+	apath = (Path *) create_append_path(root, result_rel, pathlist, NIL,
+										NIL, NULL, 0, false, -1);
 
 	/*
 	 * Estimate number of groups.  For now we just assume the output is unique
 	 * --- this is certainly true for the UNION case, and we want worst-case
 	 * estimates anyway.
 	 */
-	result_rel->rows = path->rows;
+	result_rel->rows = apath->rows;
 
 	/*
 	 * Now consider doing the same thing using the partial paths plus Append
@@ -652,7 +726,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	if (partial_paths_valid)
 	{
-		Path	   *ppath;
+		Path	   *papath;
 		int			parallel_workers = 0;
 
 		/* Find the highest number of workers requested for any subpath. */
@@ -681,19 +755,135 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		}
 		Assert(parallel_workers > 0);
 
-		ppath = (Path *)
+		papath = (Path *)
 			create_append_path(root, result_rel, NIL, partial_pathlist,
-							   NIL, NULL,
-							   parallel_workers, enable_parallel_append,
-							   -1);
-		ppath = (Path *)
-			create_gather_path(root, result_rel, ppath,
-							   result_rel->reltarget, NULL, NULL);
-		if (!op->all)
-			ppath = make_union_unique(op, ppath, tlist, root);
-		add_path(result_rel, ppath);
+							   NIL, NULL, parallel_workers,
+							   enable_parallel_append, -1);
+		gpath = (Path *)
+			create_gather_path(root, result_rel,
+							   papath, result_rel->reltarget, NULL, NULL);
 	}
 
+	if (!op->all)
+	{
+		double		dNumGroups;
+		bool		can_sort = grouping_is_sortable(groupList);
+		bool		can_hash = grouping_is_hashable(groupList);
+
+		/*
+		 * XXX for the moment, take the number of distinct groups as equal to
+		 * the total input size, ie, the worst case.  This is too
+		 * conservative, but it's not clear how to get a decent estimate of
+		 * the true size.  One should note as well the propensity of novices
+		 * to write UNION rather than UNION ALL even when they don't expect
+		 * any duplicates...
+		 */
+		dNumGroups = apath->rows;
+
+		if (can_hash)
+		{
+			Path	   *path;
+
+			/* Hashed aggregate plan --- no sort needed */
+			path = (Path *) create_agg_path(root,
+											result_rel,
+											apath,
+											create_pathtarget(root, tlist),
+											AGG_HASHED,
+											AGGSPLIT_SIMPLE,
+											groupList,
+											NIL,
+											NULL,
+											dNumGroups);
+			add_path(result_rel, path);
+
+			/* Try hash aggregate on the Gather path, if valid */
+			if (gpath != NULL)
+			{
+				/* Hashed aggregate plan --- no sort needed */
+				path = (Path *) create_agg_path(root,
+												result_rel,
+												gpath,
+												create_pathtarget(root, tlist),
+												AGG_HASHED,
+												AGGSPLIT_SIMPLE,
+												groupList,
+												NIL,
+												NULL,
+												dNumGroups);
+				add_path(result_rel, path);
+			}
+		}
+
+		if (can_sort)
+		{
+			Path	   *path = apath;
+
+			if (groupList != NIL)
+				path = (Path *) create_sort_path(root, result_rel, path,
+												 make_pathkeys_for_sortclauses(root, groupList, tlist),
+												 -1.0);
+
+			path = (Path *) create_upper_unique_path(root,
+													 result_rel,
+													 path,
+													 list_length(path->pathkeys),
+													 dNumGroups);
+
+			add_path(result_rel, path);
+
+			/* Try Sort -> Unique on the Gather path, if set */
+			if (gpath != NULL)
+			{
+				path = gpath;
+
+				path = (Path *) create_sort_path(root, result_rel, path,
+												 make_pathkeys_for_sortclauses(root, groupList, tlist),
+												 -1.0);
+
+				path = (Path *) create_upper_unique_path(root,
+														 result_rel,
+														 path,
+														 list_length(path->pathkeys),
+														 dNumGroups);
+				add_path(result_rel, path);
+			}
+		}
+
+		/*
+		 * Try making a MergeAppend path if we managed to find a path with the
+		 * correct pathkeys for each union child.
+		 */
+		if (try_sorted && groupList != NIL)
+		{
+			Path	   *path;
+
+			path = (Path *) create_merge_append_path(root,
+													 result_rel,
+													 ordered_pathlist,
+													 first_child_pathkeys,
+													 NULL);
+
+			/* and make the MergeAppend unique */
+			path = (Path *) create_upper_unique_path(root,
+													 result_rel,
+													 path,
+													 list_length(tlist),
+													 path->rows);
+
+			add_path(result_rel, path);
+		}
+	}
+	else
+	{
+		/* UNION ALL */
+		add_path(result_rel, apath);
+
+		if (gpath != NULL)
+			add_path(result_rel, gpath);
+	}
+
+
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
@@ -735,18 +925,20 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 	root->tuple_fraction = 0.0;
 
 	/* Recurse on children, ensuring their outputs are marked */
-	lrel = recurse_set_operations(op->larg, root,
+	lrel = recurse_set_operations(op->larg, root, op,
 								  op->colTypes, op->colCollations,
 								  false, 0,
 								  refnames_tlist,
 								  &lpath_tlist,
+								  NULL,
 								  &dLeftGroups);
 	lpath = lrel->cheapest_total_path;
-	rrel = recurse_set_operations(op->rarg, root,
+	rrel = recurse_set_operations(op->rarg, root, op,
 								  op->colTypes, op->colCollations,
 								  false, 1,
 								  refnames_tlist,
 								  &rpath_tlist,
+								  NULL,
 								  &dRightGroups);
 	rpath = rrel->cheapest_total_path;
 
@@ -884,7 +1076,8 @@ static List *
 plan_union_children(PlannerInfo *root,
 					SetOperationStmt *top_union,
 					List *refnames_tlist,
-					List **tlist_list)
+					List **tlist_list,
+					List **first_child_pathkeys)
 {
 	List	   *pending_rels = list_make1(top_union);
 	List	   *result = NIL;
@@ -923,12 +1116,13 @@ plan_union_children(PlannerInfo *root,
 		 * we have an EXCEPT or INTERSECT as child, else there won't be
 		 * resjunk anyway.
 		 */
-		result = lappend(result, recurse_set_operations(setOp, root,
+		result = lappend(result, recurse_set_operations(setOp, root, top_union,
 														top_union->colTypes,
 														top_union->colCollations,
 														false, -1,
 														refnames_tlist,
 														&child_tlist,
+														first_child_pathkeys,
 														NULL));
 		*tlist_list = lappend(*tlist_list, child_tlist);
 	}
@@ -936,68 +1130,6 @@ plan_union_children(PlannerInfo *root,
 	return result;
 }
 
-/*
- * Add nodes to the given path tree to unique-ify the result of a UNION.
- */
-static Path *
-make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-				  PlannerInfo *root)
-{
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-	List	   *groupList;
-	double		dNumGroups;
-
-	/* Identify the grouping semantics */
-	groupList = generate_setop_grouplist(op, tlist);
-
-	/*
-	 * XXX for the moment, take the number of distinct groups as equal to the
-	 * total input size, ie, the worst case.  This is too conservative, but
-	 * it's not clear how to get a decent estimate of the true size.  One
-	 * should note as well the propensity of novices to write UNION rather
-	 * than UNION ALL even when they don't expect any duplicates...
-	 */
-	dNumGroups = path->rows;
-
-	/* Decide whether to hash or sort */
-	if (choose_hashed_setop(root, groupList, path,
-							dNumGroups, dNumGroups,
-							"UNION"))
-	{
-		/* Hashed aggregate plan --- no sort needed */
-		path = (Path *) create_agg_path(root,
-										result_rel,
-										path,
-										create_pathtarget(root, tlist),
-										AGG_HASHED,
-										AGGSPLIT_SIMPLE,
-										groupList,
-										NIL,
-										NULL,
-										dNumGroups);
-	}
-	else
-	{
-		/* Sort and Unique */
-		if (groupList)
-			path = (Path *)
-				create_sort_path(root,
-								 result_rel,
-								 path,
-								 make_pathkeys_for_sortclauses(root,
-															   groupList,
-															   tlist),
-								 -1.0);
-		path = (Path *) create_upper_unique_path(root,
-												 result_rel,
-												 path,
-												 list_length(path->pathkeys),
-												 dNumGroups);
-	}
-
-	return path;
-}
-
 /*
  * postprocess_setop_rel - perform steps required after adding paths
  */
@@ -1403,7 +1535,7 @@ generate_append_tlist(List *colTypes, List *colCollations,
  * setop.  So what we need to do here is copy that list and install
  * proper sortgrouprefs into it (copying those from the targetlist).
  */
-static List *
+List *
 generate_setop_grouplist(SetOperationStmt *op, List *targetlist)
 {
 	List	   *grouplist = copyObject(op->groupClauses);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9e7408c7ec..b945c58a35 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -164,6 +164,9 @@ extern EquivalenceClass *match_eclasses_to_foreign_key_col(PlannerInfo *root,
 														   int colno);
 extern RestrictInfo *find_derived_clause_for_ec_member(EquivalenceClass *ec,
 													   EquivalenceMember *em);
+extern void add_union_rel_equivalences(PlannerInfo *root, Relids relids,
+									   List *tlist,
+									   List *union_pathkeys);
 extern void add_child_rel_equivalences(PlannerInfo *root,
 									   AppendRelInfo *appinfo,
 									   RelOptInfo *parent_rel,
@@ -217,6 +220,8 @@ extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
 								  ScanDirection scandir);
 extern List *build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel,
 									  ScanDirection scandir, bool *partialkeys);
+extern List *build_union_child_pathkeys(PlannerInfo *root, SetOperationStmt *op,
+										RelOptInfo *rel, List *tlist);
 extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
 									  Oid opno,
 									  Relids rel, bool create_it);
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index 54fd61c9c3..3b7641f2b6 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -53,6 +53,6 @@ extern void preprocess_aggrefs(PlannerInfo *root, Node *clause);
  * prototypes for prepunion.c
  */
 extern RelOptInfo *plan_set_operations(PlannerInfo *root);
-
+extern List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
 
 #endif							/* PREP_H */
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 64cebe4833..04492c961e 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -370,16 +370,16 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                     QUERY PLAN                                     
-------------------------------------------------------------------------------------
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  HashSetOp Intersect
                ->  Append
-                     ->  Subquery Scan on "*SELECT* 2"
-                           ->  Seq Scan on tenk1
                      ->  Subquery Scan on "*SELECT* 1"
-                           ->  Index Only Scan using tenk1_unique1 on tenk1 tenk1_1
+                           ->  Index Only Scan using tenk1_unique1 on tenk1
+                     ->  Subquery Scan on "*SELECT* 2"
+                           ->  Seq Scan on tenk1 tenk1_1
 (8 rows)
 
 select count(*) from
@@ -433,18 +433,18 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                        QUERY PLAN                                        
-------------------------------------------------------------------------------------------
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  SetOp Intersect
                ->  Sort
-                     Sort Key: "*SELECT* 2".fivethous
+                     Sort Key: "*SELECT* 1".unique1
                      ->  Append
-                           ->  Subquery Scan on "*SELECT* 2"
-                                 ->  Seq Scan on tenk1
                            ->  Subquery Scan on "*SELECT* 1"
-                                 ->  Index Only Scan using tenk1_unique1 on tenk1 tenk1_1
+                                 ->  Index Only Scan using tenk1_unique1 on tenk1
+                           ->  Subquery Scan on "*SELECT* 2"
+                                 ->  Seq Scan on tenk1 tenk1_1
 (10 rows)
 
 select count(*) from
@@ -954,7 +954,7 @@ explain (costs off)
 select from generate_series(1,5) union select from generate_series(1,3);
                            QUERY PLAN                           
 ----------------------------------------------------------------
- HashAggregate
+ Unique
    ->  Append
          ->  Function Scan on generate_series
          ->  Function Scan on generate_series generate_series_1
@@ -1102,16 +1102,17 @@ explain (costs off)
   UNION
   SELECT * FROM t2) t
  WHERE ab = 'ab';
-                    QUERY PLAN                     
----------------------------------------------------
- HashAggregate
-   Group Key: ((t1.a || t1.b))
-   ->  Append
-         ->  Index Scan using t1_ab_idx on t1
-               Index Cond: ((a || b) = 'ab'::text)
-         ->  Index Only Scan using t2_pkey on t2
-               Index Cond: (ab = 'ab'::text)
-(7 rows)
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Unique
+   ->  Sort
+         Sort Key: ((t1.a || t1.b))
+         ->  Append
+               ->  Index Scan using t1_ab_idx on t1
+                     Index Cond: ((a || b) = 'ab'::text)
+               ->  Index Only Scan using t2_pkey on t2
+                     Index Cond: (ab = 'ab'::text)
+(8 rows)
 
 --
 -- Test that ORDER BY for UNION ALL can be pushed down to inheritance
#2 David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#1)
1 attachment(s)
Re: Properly pathify the union planner

On Thu, 2 Nov 2023 at 12:42, David Rowley <dgrowleyml@gmail.com> wrote:

> One part that still needs work is the EquivalenceClass building.
> Because we only build the final targetlist for the Append after
> planning all the append child queries, I ended up having to populate
> the EquivalenceClasses backwards, i.e. children first. add_eq_member()
> determines if you're passing a child member by checking if parent !=
> NULL. Since I don't have a parent EquivalenceMember to pass,
> em_is_child gets set wrongly, and that causes problems because
> ec_has_const can get set to true when it shouldn't. This is a problem
> as it can make a PathKey redundant when it isn't. I wonder if I'll
> need to change the signature of add_eq_member() and add an "is_child"
> bool to force the EM to be a child em... Needs more thought...

I've spent more time working on this and I ended up solving the above
problem by delaying the subquery path creation on the union children
until after we've built the top-level targetlist. This allows the
parent eclasses to be correctly added before adding members for the
union children. (See build_setop_child_paths() in the patch)

There's been quite a bit of progress in other areas too. Incremental
sorts now work:

# create table t1(a int primary key, b int not null);
# create table t2(a int primary key, b int not null);
# insert into t1 select x,x from generate_Series(1,1000000)x;
# insert into t2 select x,x from generate_Series(1,1000000)x;
# vacuum analyze t1,t2;

# explain (costs off) select * from t1 union select * from t2;
                    QUERY PLAN
--------------------------------------------------
 Unique
   ->  Merge Append
         Sort Key: t1.a, t1.b
         ->  Incremental Sort
               Sort Key: t1.a, t1.b
               Presorted Key: t1.a
               ->  Index Scan using t1_pkey on t1
         ->  Incremental Sort
               Sort Key: t2.a, t2.b
               Presorted Key: t2.a
               ->  Index Scan using t2_pkey on t2
(11 rows)

However, I've not yet made the MergeAppend UNIONs work when the
datatypes don't match on either side of the UNION. The reason this
doesn't work yet is that convert_subquery_pathkeys() is unable to find
the pathkey targets in the targetlist; the actual targets can't be
found due to the typecast. I wondered if this could be fixed by adding
an additional projection path to the subquery when the output columns
don't match the setop->colTypes, but I'm a bit put off by the comment
in transformSetOperationTree:

* For all non-UNKNOWN-type cases, we verify coercibility but we
* don't modify the child's expression, for fear of changing the
* child query's semantics.

I assume that's worried about the semantics of things like WHERE
clauses, so maybe the projection path in the subquery would be ok. I
need to spend more time on that.

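To illustrate with a hypothetical example (t3 is made up here; t1 is the
table created above): t1.a is int, so it must be cast to bigint to match
t3.a, and convert_subquery_pathkeys() can't match the casted target
entries, so the query falls back to sorting or hashing the Append result
rather than using Merge Append:

# create table t3 (a bigint primary key, b int not null);
# explain (costs off) select * from t1 union select * from t3;
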
Another problem I hit was add_path() pfreeing a Path that I needed.
This was happening due to how I'm building the final paths in the
subquery when setop_pathkeys are set. I want to always include the
cheapest_input_path so that it can be used in hash-based UNIONs, and I
also want to provide sorted paths so that MergeAppend has something to
work with. I found cases where I'd add_path() the cheapest_input_path
to the final rel and then also attempt to sort that path.
Unfortunately, add_path() found the unsorted and sorted paths to be
fuzzily the same cost and opted to keep the sorted one due to it
having better pathkeys. add_path() then pfree'd the
cheapest_input_path, which meant the Sort's subpath was gone, which
obviously caused issues in createplan.c.

For now, as a temporary fix, I've just #ifdef'd out the code in
add_path() that's pfreeing the old path. I have drafted a patch that
refcounts Paths, but I'm unsure if that's the correct solution as I'm
only maintaining the refcounts in add_path() and add_partial_path(). I
think a truly correct solution would bump the refcount when a path is
used as some other path's subpath. That would mean having to
recursively pfree paths up until we find one with a refcount>0. Seems
a bit expensive for add_path() to do.

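For reference, the temporarily disabled code is the spot in add_path()
that frees a rejected old path; roughly (from memory, the surrounding
details may differ slightly):

    /* in add_path(), when an existing path loses to the new path */
    #ifdef NOT_USED             /* temporary: keep rejected paths alive */
        if (!IsA(old_path, IndexPath))
            pfree(old_path);
    #endif
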
I've attached the updated patch. This one is probably ready for
someone to test out. There will be more work to do, however.

David

Attachments:

better_union_planner_v1.patch (text/plain; charset=US-ASCII)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 22cae37a1e..51e6dd028c 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -11194,6 +11194,11 @@ DROP INDEX base_tbl1_idx;
 DROP INDEX base_tbl2_idx;
 DROP INDEX async_p3_idx;
 -- UNION queries
+SET enable_sort TO off;
+SET enable_incremental_sort TO off;
+-- Adjust fdw_startup_cost so that we get an unordered path in the Append.
+-- XXX or is it better to just accept the plan change?
+ALTER SERVER loopback2 OPTIONS (ADD fdw_startup_cost '0.00');
 EXPLAIN (VERBOSE, COSTS OFF)
 INSERT INTO result_tbl
 (SELECT a, b, 'AAA' || c FROM async_p1 ORDER BY a LIMIT 10)
@@ -11275,6 +11280,9 @@ SELECT * FROM result_tbl ORDER BY a;
 (12 rows)
 
 DELETE FROM result_tbl;
+RESET enable_incremental_sort;
+RESET enable_sort;
+ALTER SERVER loopback2 OPTIONS (DROP fdw_startup_cost);
 -- Disable async execution if we use gating Result nodes for pseudoconstant
 -- quals
 EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 075da4ff86..e638402e16 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3733,6 +3733,12 @@ DROP INDEX base_tbl2_idx;
 DROP INDEX async_p3_idx;
 
 -- UNION queries
+SET enable_sort TO off;
+SET enable_incremental_sort TO off;
+-- Adjust fdw_startup_cost so that we get an unordered path in the Append.
+-- XXX or is it better to just accept the plan change?
+ALTER SERVER loopback2 OPTIONS (ADD fdw_startup_cost '0.00');
+
 EXPLAIN (VERBOSE, COSTS OFF)
 INSERT INTO result_tbl
 (SELECT a, b, 'AAA' || c FROM async_p1 ORDER BY a LIMIT 10)
@@ -3759,6 +3765,10 @@ UNION ALL
 SELECT * FROM result_tbl ORDER BY a;
 DELETE FROM result_tbl;
 
+RESET enable_incremental_sort;
+RESET enable_sort;
+ALTER SERVER loopback2 OPTIONS (DROP fdw_startup_cost);
+
 -- Disable async execution if we use gating Result nodes for pseudoconstant
 -- quals
 EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 7fa502d6e2..740c0c530d 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2856,6 +2856,55 @@ add_child_join_rel_equivalences(PlannerInfo *root,
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * add_setop_child_rel_equivalences
+ *		Add equivalence members for each non-resjunk target in 'child_tlist'
+ *		to the EquivalenceClass in the corresponding setop_pathkey's pk_class.
+ *
+ * 'root' is the PlannerInfo belonging to the top-level set operation.
+ * 'child_relids' are the Relids of the child relation we're adding
+ * EquivalenceMembers for.
+ * 'child_tlist' is the target list for the setop child relation.  The target
+ * list expressions are what we add as EquivalenceMembers.
+ * 'setop_pathkeys' is a list of PathKeys which must contain an entry for each
+ * non-resjunk target in 'child_tlist'.
+ */
+void
+add_setop_child_rel_equivalences(PlannerInfo *root, Relids child_relids,
+								 List *child_tlist, List *setop_pathkeys)
+{
+	ListCell   *lc;
+	ListCell   *lc2 = list_head(setop_pathkeys);
+
+	foreach(lc, child_tlist)
+	{
+		TargetEntry *tle = lfirst_node(TargetEntry, lc);
+		PathKey    *pk;
+
+		if (tle->resjunk)
+			continue;
+
+		if (lc2 == NULL)
+			elog(ERROR, "too few pathkeys for set operation");
+
+		pk = lfirst_node(PathKey, lc2);
+
+		/*
+		 * We can safely pass the parent member as the first member in the
+		 * members list as this is added first in build_setop_pathkeys. XXX,
+		 * yeah but is that a good assumption to make?
+		 */
+		add_eq_member(pk->pk_eclass,
+					  tle->expr,
+					  child_relids,
+					  NULL,		/* XXX need to fake up a join domain? */
+					  linitial(pk->pk_eclass->ec_members),
+					  exprType((Node *) tle->expr));
+
+		lc2 = lnext(setop_pathkeys, lc2);
+	}
+}
+
 
 /*
  * generate_implied_equalities_for_column
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index fdb60aaa8d..bb6278fff4 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -786,6 +786,71 @@ build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel,
 	return retval;
 }
 
+/*
+ * build_setop_pathkeys
+ *	  Build and return a list of PathKeys, one for each non-junk target in
+ *	  'tlist'.
+ */
+List *
+build_setop_pathkeys(PlannerInfo *root, SetOperationStmt *op,
+					 Relids relids, List *tlist)
+{
+	ListCell   *lc;
+	ListCell   *sgccell = list_head(op->groupClauses);
+	List	   *retval = NIL;
+
+	foreach(lc, tlist)
+	{
+		TargetEntry *tle = lfirst_node(TargetEntry, lc);
+		SortGroupClause *sgc;
+
+		Oid			opfamily;
+		Oid			opcintype;
+		int16		strategy;
+		PathKey    *cpathkey;
+
+		if (tle->resjunk)
+			continue;
+
+		/*
+		 * XXX query_is_distinct_for() is happy to Assert this, should this do
+		 * that rather than ERROR?
+		 */
+		if (sgccell == NULL)
+			elog(ERROR, "too few group clauses");
+
+		sgc = lfirst_node(SortGroupClause, sgccell);
+
+		/* Find the operator in pg_amop --- failure shouldn't happen */
+		if (!get_ordering_op_properties(sgc->sortop,
+										&opfamily, &opcintype, &strategy))
+			elog(ERROR, "operator %u is not a valid ordering operator",
+				 sgc->eqop);
+
+		cpathkey = make_pathkey_from_sortinfo(root,
+											  tle->expr,
+											  opfamily,
+											  opcintype,
+											  exprCollation((Node *) tle->expr),
+											  false,
+											  sgc->nulls_first,
+											  0,
+											  relids,
+											  true);
+		retval = lappend(retval, cpathkey);
+		sgccell = lnext(op->groupClauses, sgccell);
+
+		/*
+		 * There's no need to look for redundant pathkeys as set operations
+		 * have no ability to have non-child constants in an EquivalenceClass.
+		 * Let's just make sure that remains true.
+		 */
+		Assert(!EC_MUST_BE_REDUNDANT(cpathkey->pk_eclass));
+	}
+
+	return retval;
+}
+
 /*
  * build_expression_pathkey
  *	  Build a pathkeys list that describes an ordering by a single expression
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a8cea5efe1..479110bed4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -126,12 +126,16 @@ typedef struct
 {
 	List	   *activeWindows;	/* active windows, if any */
 	grouping_sets_data *gset_data;	/* grouping sets data, if any */
+	SetOperationStmt *setop;	/* parent set operation or NULL if not a
+								 * subquery belonging to a set operation */
 } standard_qp_extra;
 
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
 static void grouping_planner(PlannerInfo *root, double tuple_fraction);
+static Path *build_final_modify_table_path(PlannerInfo *root,
+										   RelOptInfo *final_rel, Path *path);
 static grouping_sets_data *preprocess_grouping_sets(PlannerInfo *root);
 static List *remap_to_groupclause_idx(List *groupClause, List *gsets,
 									  int *tleref_to_colnum_map);
@@ -1505,6 +1509,18 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		qp_extra.activeWindows = activeWindows;
 		qp_extra.gset_data = gset_data;
 
+		/*
+		 * Check if we're a subquery for a set operation.  If we are, store
+		 * the SetOperationStmt in qp_extra.
+		 */
+		if (root->parent_root != NULL &&
+			root->parent_root->parse->setOperations != NULL &&
+			IsA(root->parent_root->parse->setOperations, SetOperationStmt))
+			qp_extra.setop =
+				(SetOperationStmt *) root->parent_root->parse->setOperations;
+		else
+			qp_extra.setop = NULL;
+
 		/*
 		 * Generate the best unsorted and presorted paths for the scan/join
 		 * portion of this Query, ie the processing represented by the
@@ -1753,6 +1769,13 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 	foreach(lc, current_rel->pathlist)
 	{
 		Path	   *path = (Path *) lfirst(lc);
+		Path	   *input_path;
+
+		/*
+		 * Keep record of the unmodified path so that we can still tell which
+		 * one is the cheapest_input_path.
+		 */
+		input_path = path;
 
 		/*
 		 * If there is a FOR [KEY] UPDATE/SHARE clause, add the LockRows node.
@@ -1781,191 +1804,86 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		}
 
 		/*
-		 * If this is an INSERT/UPDATE/DELETE/MERGE, add the ModifyTable node.
+		 * Create paths to suit final sort order required for setop_pathkeys.
+		 * Here we'll sort the cheapest input path (if not sorted already) and
+		 * incremental sort any paths which are partially sorted.  We also
+		 * include the cheapest path as-is so that the set operation can be
+		 * cheaply implemented using a method which does not require the input
+		 * to be sorted.
 		 */
-		if (parse->commandType != CMD_SELECT)
+		if (root->setop_pathkeys != NIL)
 		{
-			Index		rootRelation;
-			List	   *resultRelations = NIL;
-			List	   *updateColnosLists = NIL;
-			List	   *withCheckOptionLists = NIL;
-			List	   *returningLists = NIL;
-			List	   *mergeActionLists = NIL;
-			List	   *rowMarks;
-
-			if (bms_membership(root->all_result_relids) == BMS_MULTIPLE)
-			{
-				/* Inherited UPDATE/DELETE/MERGE */
-				RelOptInfo *top_result_rel = find_base_rel(root,
-														   parse->resultRelation);
-				int			resultRelation = -1;
+			Path	   *cheapest_input_path = current_rel->cheapest_total_path;
+			bool		is_sorted;
+			int			presorted_keys;
 
-				/* Pass the root result rel forward to the executor. */
-				rootRelation = parse->resultRelation;
+			is_sorted = pathkeys_count_contained_in(root->setop_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
 
-				/* Add only leaf children to ModifyTable. */
-				while ((resultRelation = bms_next_member(root->leaf_result_relids,
-														 resultRelation)) >= 0)
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_input_path)
 				{
-					RelOptInfo *this_result_rel = find_base_rel(root,
-																resultRelation);
-
-					/*
-					 * Also exclude any leaf rels that have turned dummy since
-					 * being added to the list, for example, by being excluded
-					 * by constraint exclusion.
-					 */
-					if (IS_DUMMY_REL(this_result_rel))
+					if (presorted_keys == 0 || !enable_incremental_sort)
 						continue;
-
-					/* Build per-target-rel lists needed by ModifyTable */
-					resultRelations = lappend_int(resultRelations,
-												  resultRelation);
-					if (parse->commandType == CMD_UPDATE)
-					{
-						List	   *update_colnos = root->update_colnos;
-
-						if (this_result_rel != top_result_rel)
-							update_colnos =
-								adjust_inherited_attnums_multilevel(root,
-																	update_colnos,
-																	this_result_rel->relid,
-																	top_result_rel->relid);
-						updateColnosLists = lappend(updateColnosLists,
-													update_colnos);
-					}
-					if (parse->withCheckOptions)
-					{
-						List	   *withCheckOptions = parse->withCheckOptions;
-
-						if (this_result_rel != top_result_rel)
-							withCheckOptions = (List *)
-								adjust_appendrel_attrs_multilevel(root,
-																  (Node *) withCheckOptions,
-																  this_result_rel,
-																  top_result_rel);
-						withCheckOptionLists = lappend(withCheckOptionLists,
-													   withCheckOptions);
-					}
-					if (parse->returningList)
-					{
-						List	   *returningList = parse->returningList;
-
-						if (this_result_rel != top_result_rel)
-							returningList = (List *)
-								adjust_appendrel_attrs_multilevel(root,
-																  (Node *) returningList,
-																  this_result_rel,
-																  top_result_rel);
-						returningLists = lappend(returningLists,
-												 returningList);
-					}
-					if (parse->mergeActionList)
-					{
-						ListCell   *l;
-						List	   *mergeActionList = NIL;
-
-						/*
-						 * Copy MergeActions and translate stuff that
-						 * references attribute numbers.
-						 */
-						foreach(l, parse->mergeActionList)
-						{
-							MergeAction *action = lfirst(l),
-									   *leaf_action = copyObject(action);
-
-							leaf_action->qual =
-								adjust_appendrel_attrs_multilevel(root,
-																  (Node *) action->qual,
-																  this_result_rel,
-																  top_result_rel);
-							leaf_action->targetList = (List *)
-								adjust_appendrel_attrs_multilevel(root,
-																  (Node *) action->targetList,
-																  this_result_rel,
-																  top_result_rel);
-							if (leaf_action->commandType == CMD_UPDATE)
-								leaf_action->updateColnos =
-									adjust_inherited_attnums_multilevel(root,
-																		action->updateColnos,
-																		this_result_rel->relid,
-																		top_result_rel->relid);
-							mergeActionList = lappend(mergeActionList,
-													  leaf_action);
-						}
-
-						mergeActionLists = lappend(mergeActionLists,
-												   mergeActionList);
-					}
 				}
-
-				if (resultRelations == NIL)
+				else
 				{
 					/*
-					 * We managed to exclude every child rel, so generate a
-					 * dummy one-relation plan using info for the top target
-					 * rel (even though that may not be a leaf target).
-					 * Although it's clear that no data will be updated or
-					 * deleted, we still need to have a ModifyTable node so
-					 * that any statement triggers will be executed.  (This
-					 * could be cleaner if we fixed nodeModifyTable.c to allow
-					 * zero target relations, but that probably wouldn't be a
-					 * net win.)
+					 * If this is an INSERT/UPDATE/DELETE/MERGE, add a
+					 * ModifyTable path.
 					 */
-					resultRelations = list_make1_int(parse->resultRelation);
-					if (parse->commandType == CMD_UPDATE)
-						updateColnosLists = list_make1(root->update_colnos);
-					if (parse->withCheckOptions)
-						withCheckOptionLists = list_make1(parse->withCheckOptions);
-					if (parse->returningList)
-						returningLists = list_make1(parse->returningList);
-					if (parse->mergeActionList)
-						mergeActionLists = list_make1(parse->mergeActionList);
-				}
-			}
-			else
-			{
-				/* Single-relation INSERT/UPDATE/DELETE/MERGE. */
-				rootRelation = 0;	/* there's no separate root rel */
-				resultRelations = list_make1_int(parse->resultRelation);
-				if (parse->commandType == CMD_UPDATE)
-					updateColnosLists = list_make1(root->update_colnos);
-				if (parse->withCheckOptions)
-					withCheckOptionLists = list_make1(parse->withCheckOptions);
-				if (parse->returningList)
-					returningLists = list_make1(parse->returningList);
-				if (parse->mergeActionList)
-					mergeActionLists = list_make1(parse->mergeActionList);
-			}
+					if (parse->commandType != CMD_SELECT)
+						path = build_final_modify_table_path(root,
+															 final_rel,
+															 path);
 
-			/*
-			 * If there was a FOR [KEY] UPDATE/SHARE clause, the LockRows node
-			 * will have dealt with fetching non-locked marked rows, else we
-			 * need to have ModifyTable do that.
-			 */
-			if (parse->rowMarks)
-				rowMarks = NIL;
-			else
-				rowMarks = root->rowMarks;
+					/*
+					 * The union planner might want to try a hash-based method
+					 * of executing the set operation, so let's provide the
+					 * cheapest input path so that it can do that as cheaply
+					 * as possible.  If the cheapest_input_path happens to be
+					 * already correctly sorted then we'll add it to final_rel
+					 * at the end of the loop.
+					 */
+					add_path(final_rel, path);
+				}
 
-			path = (Path *)
-				create_modifytable_path(root, final_rel,
-										path,
-										parse->commandType,
-										parse->canSetTag,
-										parse->resultRelation,
-										rootRelation,
-										root->partColsUpdated,
-										resultRelations,
-										updateColnosLists,
-										withCheckOptionLists,
-										returningLists,
-										rowMarks,
-										parse->onConflict,
-										mergeActionLists,
-										assign_special_exec_param(root));
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 final_rel,
+													 path,
+													 root->setop_pathkeys,
+													 limit_tuples);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 final_rel,
+																 path,
+																 root->setop_pathkeys,
+																 presorted_keys,
+																 limit_tuples);
+			}
 		}
 
+		/*
+		 * If this is an INSERT/UPDATE/DELETE/MERGE, add a ModifyTable path.
+		 */
+		if (parse->commandType != CMD_SELECT)
+			path = build_final_modify_table_path(root, final_rel, path);
+
 		/* And shove it into final_rel */
 		add_path(final_rel, path);
 	}
@@ -1978,12 +1896,125 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		!limit_needed(parse))
 	{
 		Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
+
+		/*
+		 * XXX adding all of these seems excessive. Why not just the cheapest?
+		 */
 		foreach(lc, current_rel->partial_pathlist)
 		{
 			Path	   *partial_path = (Path *) lfirst(lc);
 
 			add_partial_path(final_rel, partial_path);
 		}
+
+		/*
+		 * When planning for set operations, try sorting the cheapest partial
+		 * path.
+		 */
+		if (root->setop_pathkeys != NIL &&
+			current_rel->partial_pathlist != NIL)
+		{
+			Path	   *cheapest_partial_path;
+
+			cheapest_partial_path = linitial(current_rel->partial_pathlist);
+
+			/*
+			 * If cheapest partial path doesn't need a sort, this is redundant
+			 * with what's already been tried.
+			 */
+			if (!pathkeys_contained_in(root->setop_pathkeys,
+									   cheapest_partial_path->pathkeys))
+			{
+				Path	   *path;
+				double		total_groups;
+
+				path = (Path *) create_sort_path(root,
+												 final_rel,
+												 cheapest_partial_path,
+												 root->setop_pathkeys,
+												 limit_tuples);
+
+				total_groups = cheapest_partial_path->rows *
+					cheapest_partial_path->parallel_workers;
+				path = (Path *) create_gather_merge_path(root,
+														 final_rel,
+														 path,
+														 path->pathtarget,
+														 root->setop_pathkeys,
+														 NULL,
+														 &total_groups);
+
+				add_path(final_rel, path);
+			}
+
+			/*
+			 * Consider incremental sort with a gather merge on partial paths.
+			 *
+			 * We can also skip the entire loop when we only have a single
+			 * pathkey in setop_pathkeys because we can't have partially
+			 * sorted paths when there's just a single pathkey to sort by.
+			 */
+			if (enable_incremental_sort &&
+				list_length(root->setop_pathkeys) > 1)
+			{
+				foreach(lc, current_rel->partial_pathlist)
+				{
+					Path	   *input_path = (Path *) lfirst(lc);
+					Path	   *sorted_path;
+					bool		is_sorted;
+					int			presorted_keys;
+					double		total_groups;
+
+					/*
+					 * We don't care if this is the cheapest partial path - we
+					 * can't simply skip it, because it may be partially
+					 * sorted in which case we want to consider adding
+					 * incremental sort (instead of full sort, which is what
+					 * happens above).
+					 */
+
+					is_sorted =
+						pathkeys_count_contained_in(root->setop_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+					/*
+					 * No point in adding incremental sort on fully sorted
+					 * paths.
+					 */
+					if (is_sorted)
+						continue;
+
+					if (presorted_keys == 0)
+						continue;
+
+					/*
+					 * Since we have presorted keys, consider incremental
+					 * sort.
+					 */
+					sorted_path = (Path *)
+						create_incremental_sort_path(root,
+													 final_rel,
+													 input_path,
+													 root->setop_pathkeys,
+													 presorted_keys,
+													 limit_tuples);
+					total_groups =
+						input_path->rows * input_path->parallel_workers;
+					sorted_path = (Path *)
+						create_gather_merge_path(root,
+												 final_rel,
+												 sorted_path,
+												 sorted_path->pathtarget,
+												 root->setop_pathkeys,
+												 NULL,
+												 &total_groups);
+
+					add_path(final_rel, sorted_path);
+				}
+			}
+
+		}
 	}
 
 	extra.limit_needed = limit_needed(parse);
@@ -2009,6 +2040,187 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 	/* Note: currently, we leave it to callers to do set_cheapest() */
 }
 
+static Path *
+build_final_modify_table_path(PlannerInfo *root, RelOptInfo *final_rel,
+							  Path *path)
+{
+	Query	   *parse = root->parse;
+	Index		rootRelation;
+	List	   *resultRelations = NIL;
+	List	   *updateColnosLists = NIL;
+	List	   *withCheckOptionLists = NIL;
+	List	   *returningLists = NIL;
+	List	   *mergeActionLists = NIL;
+	List	   *rowMarks;
+
+	if (bms_membership(root->all_result_relids) == BMS_MULTIPLE)
+	{
+		/* Inherited UPDATE/DELETE/MERGE */
+		RelOptInfo *top_result_rel = find_base_rel(root,
+												   parse->resultRelation);
+		int			resultRelation = -1;
+
+		/* Pass the root result rel forward to the executor. */
+		rootRelation = parse->resultRelation;
+
+		/* Add only leaf children to ModifyTable. */
+		while ((resultRelation = bms_next_member(root->leaf_result_relids,
+												 resultRelation)) >= 0)
+		{
+			RelOptInfo *this_result_rel = find_base_rel(root, resultRelation);
+
+			/*
+			 * Also exclude any leaf rels that have turned dummy since being
+			 * added to the list, for example, by being excluded by constraint
+			 * exclusion.
+			 */
+			if (IS_DUMMY_REL(this_result_rel))
+				continue;
+
+			/* Build per-target-rel lists needed by ModifyTable */
+			resultRelations = lappend_int(resultRelations, resultRelation);
+			if (parse->commandType == CMD_UPDATE)
+			{
+				List	   *update_colnos = root->update_colnos;
+
+				if (this_result_rel != top_result_rel)
+					update_colnos =
+						adjust_inherited_attnums_multilevel(root,
+															update_colnos,
+															this_result_rel->relid,
+															top_result_rel->relid);
+				updateColnosLists = lappend(updateColnosLists, update_colnos);
+			}
+			if (parse->withCheckOptions)
+			{
+				List	   *withCheckOptions = parse->withCheckOptions;
+
+				if (this_result_rel != top_result_rel)
+					withCheckOptions = (List *)
+						adjust_appendrel_attrs_multilevel(root,
+														  (Node *) withCheckOptions,
+														  this_result_rel,
+														  top_result_rel);
+				withCheckOptionLists = lappend(withCheckOptionLists,
+											   withCheckOptions);
+			}
+			if (parse->returningList)
+			{
+				List	   *returningList = parse->returningList;
+
+				if (this_result_rel != top_result_rel)
+					returningList = (List *)
+						adjust_appendrel_attrs_multilevel(root,
+														  (Node *) returningList,
+														  this_result_rel,
+														  top_result_rel);
+				returningLists = lappend(returningLists, returningList);
+			}
+			if (parse->mergeActionList)
+			{
+				ListCell   *l;
+				List	   *mergeActionList = NIL;
+
+				/*
+				 * Copy MergeActions and translate stuff that references
+				 * attribute numbers.
+				 */
+				foreach(l, parse->mergeActionList)
+				{
+					MergeAction *action = lfirst(l),
+							   *leaf_action = copyObject(action);
+
+					leaf_action->qual =
+						adjust_appendrel_attrs_multilevel(root,
+														  (Node *) action->qual,
+														  this_result_rel,
+														  top_result_rel);
+					leaf_action->targetList = (List *)
+						adjust_appendrel_attrs_multilevel(root,
+														  (Node *) action->targetList,
+														  this_result_rel,
+														  top_result_rel);
+					if (leaf_action->commandType == CMD_UPDATE)
+						leaf_action->updateColnos =
+							adjust_inherited_attnums_multilevel(root,
+																action->updateColnos,
+																this_result_rel->relid,
+																top_result_rel->relid);
+					mergeActionList = lappend(mergeActionList, leaf_action);
+				}
+
+				mergeActionLists = lappend(mergeActionLists, mergeActionList);
+			}
+		}
+
+		if (resultRelations == NIL)
+		{
+			/*
+			 * We managed to exclude every child rel, so generate a dummy
+			 * one-relation plan using info for the top target rel (even
+			 * though that may not be a leaf target). Although it's clear that
+			 * no data will be updated or deleted, we still need to have a
+			 * ModifyTable node so that any statement triggers will be
+			 * executed.  (This could be cleaner if we fixed nodeModifyTable.c
+			 * to allow zero target relations, but that probably wouldn't be a
+			 * net win.)
+			 */
+			resultRelations = list_make1_int(parse->resultRelation);
+			if (parse->commandType == CMD_UPDATE)
+				updateColnosLists = list_make1(root->update_colnos);
+			if (parse->withCheckOptions)
+				withCheckOptionLists = list_make1(parse->withCheckOptions);
+			if (parse->returningList)
+				returningLists = list_make1(parse->returningList);
+			if (parse->mergeActionList)
+				mergeActionLists = list_make1(parse->mergeActionList);
+		}
+	}
+	else
+	{
+		/* Single-relation INSERT/UPDATE/DELETE/MERGE. */
+		rootRelation = 0;		/* there's no separate root rel */
+		resultRelations = list_make1_int(parse->resultRelation);
+		if (parse->commandType == CMD_UPDATE)
+			updateColnosLists = list_make1(root->update_colnos);
+		if (parse->withCheckOptions)
+			withCheckOptionLists = list_make1(parse->withCheckOptions);
+		if (parse->returningList)
+			returningLists = list_make1(parse->returningList);
+		if (parse->mergeActionList)
+			mergeActionLists = list_make1(parse->mergeActionList);
+	}
+
+	/*
+	 * If there was a FOR [KEY] UPDATE/SHARE clause, the LockRows node will
+	 * have dealt with fetching non-locked marked rows, else we need to have
+	 * ModifyTable do that.
+	 */
+	if (parse->rowMarks)
+		rowMarks = NIL;
+	else
+		rowMarks = root->rowMarks;
+
+	path = (Path *) create_modifytable_path(root,
+											final_rel,
+											path,
+											parse->commandType,
+											parse->canSetTag,
+											parse->resultRelation,
+											rootRelation,
+											root->partColsUpdated,
+											resultRelations,
+											updateColnosLists,
+											withCheckOptionLists,
+											returningLists,
+											rowMarks,
+											parse->onConflict,
+											mergeActionLists,
+											assign_special_exec_param(root));
+
+	return path;
+}
+
 /*
  * Do preprocessing for groupingSets clause and related data.  This handles the
  * preliminary steps of expanding the grouping sets, organizing them into lists
@@ -3524,6 +3736,38 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 									  parse->sortClause,
 									  tlist);
 
+	/* setting setop_pathkeys might be useful to the union planner */
+	if (qp_extra->setop != NULL &&
+		set_operation_ordered_results_useful(qp_extra->setop))
+	{
+		List	   *groupClauses;
+		ListCell   *lc;
+		bool		sortable;
+
+		foreach(lc, root->processed_tlist)
+		{
+			TargetEntry *tle = lfirst_node(TargetEntry, lc);
+
+			if (tle->resjunk)
+				continue;
+
+			tle->ressortgroupref = tle->resno;
+		}
+
+		groupClauses = generate_setop_grouplist(qp_extra->setop, tlist);
+
+		root->setop_pathkeys =
+			make_pathkeys_for_sortclauses_extended(root,
+												   &groupClauses,
+												   tlist,
+												   false,
+												   &sortable);
+		if (!sortable)
+			root->setop_pathkeys = NIL;
+	}
+	else
+		root->setop_pathkeys = NIL;
+
 	/*
 	 * Figure out whether we want a sorted result from query_planner.
 	 *
@@ -3533,7 +3777,9 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 	 * sortable DISTINCT clause that's more rigorous than the ORDER BY clause,
 	 * we try to produce output that's sufficiently well sorted for the
 	 * DISTINCT.  Otherwise, if there is an ORDER BY clause, we want to sort
-	 * by the ORDER BY clause.
+	 * by the ORDER BY clause.  Otherwise, if we're a subquery being planned
+	 * for a set operation with a sortable targetlist, we want to sort by the
+	 * target list.
 	 *
 	 * Note: if we have both ORDER BY and GROUP BY, and ORDER BY is a superset
 	 * of GROUP BY, it would be tempting to request sort by ORDER BY --- but
@@ -3551,6 +3797,8 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 		root->query_pathkeys = root->distinct_pathkeys;
 	else if (root->sort_pathkeys)
 		root->query_pathkeys = root->sort_pathkeys;
+	else if (root->setop_pathkeys != NIL)
+		root->query_pathkeys = root->setop_pathkeys;
 	else
 		root->query_pathkeys = NIL;
 }
@@ -5306,11 +5554,8 @@ create_ordered_paths(PlannerInfo *root,
 		(*create_upper_paths_hook) (root, UPPERREL_ORDERED,
 									input_rel, ordered_rel, NULL);
 
-	/*
-	 * No need to bother with set_cheapest here; grouping_planner does not
-	 * need us to do it.
-	 */
-	Assert(ordered_rel->pathlist != NIL);
+	/* Now choose the best path(s) */
+	set_cheapest(ordered_rel);
 
 	return ordered_rel;
 }
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 8eaa734916..6ee828aff5 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -51,11 +51,15 @@ static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
 										  bool junkOK,
 										  int flag, List *refnames_tlist,
 										  List **pTargetList,
+										  bool *istrivial_tlist,
 										  double *pNumGroups);
 static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
 										   PlannerInfo *root,
 										   List *refnames_tlist,
 										   List **pTargetList);
+static void build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
+									bool trivial_tlist, List *child_tlist,
+									List *interesting_pathkeys);
 static RelOptInfo *generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 										List *refnames_tlist,
 										List **pTargetList);
@@ -65,9 +69,8 @@ static RelOptInfo *generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *ro
 static List *plan_union_children(PlannerInfo *root,
 								 SetOperationStmt *top_union,
 								 List *refnames_tlist,
-								 List **tlist_list);
-static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-							   PlannerInfo *root);
+								 List **tlist_list,
+								 List **istrivial_tlist);
 static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 								Path *input_path,
@@ -84,7 +87,6 @@ static List *generate_append_tlist(List *colTypes, List *colCollations,
 								   bool flag,
 								   List *input_tlists,
 								   List *refnames_tlist);
-static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
 
 
 /*
@@ -160,6 +162,8 @@ plan_set_operations(PlannerInfo *root)
 	}
 	else
 	{
+		bool		trivial_tlist;
+
 		/*
 		 * Recurse on setOperations tree to generate paths for set ops. The
 		 * final output paths should have just the column types shown as the
@@ -171,6 +175,7 @@ plan_set_operations(PlannerInfo *root)
 										   true, -1,
 										   leftmostQuery->targetList,
 										   &top_tlist,
+										   &trivial_tlist,
 										   NULL);
 	}
 
@@ -180,6 +185,33 @@ plan_set_operations(PlannerInfo *root)
 	return setop_rel;
 }
 
+/*
+ * set_operation_ordered_results_useful
+ *		Return true if the given SetOperationStmt can be executed by utilizing
+ *		paths that provide sorted input according to the setop's targetlist.
+ *		Returns false when sorted paths are not any more useful than unsorted
+ *		ones.
+ */
+bool
+set_operation_ordered_results_useful(SetOperationStmt *setop)
+{
+	/*
+	 * Paths sorted by the targetlist are useful for UNION as we can opt to
+	 * MergeAppend the sorted paths then Unique them.  Ordered paths are no
+	 * more useful than unordered ones for UNION ALL.
+	 */
+	if (!setop->all && setop->op == SETOP_UNION)
+		return true;
+
+	/*
+	 * EXCEPT / EXCEPT ALL / INTERSECT / INTERSECT ALL could benefit too, but
+	 * that would require additional path work.
+	 */
+
+	/* sorted paths are not deemed of any additional use for this setop */
+	return false;
+}
+
 /*
  * recurse_set_operations
  *	  Recursively handle one step in a tree of set operations
@@ -200,6 +232,10 @@ plan_set_operations(PlannerInfo *root)
  * the logic in this file depends on flag columns being marked resjunk.
  * Pending a redesign of how that works, this is the easy way out.
  *
+ * 'istrivial_tlist' is an output parameter and is set to true when the
+ * pTargetList can be deemed to be a trivial target list as determined by
+ * generate_setop_tlist().
+ *
  * We don't have to care about typmods here: the only allowed difference
  * between set-op input and output typmods is input is a specific typmod
  * and output is -1, and that does not require a coercion.
@@ -210,9 +246,12 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   bool junkOK,
 					   int flag, List *refnames_tlist,
 					   List **pTargetList,
+					   bool *istrivial_tlist,
 					   double *pNumGroups)
 {
-	RelOptInfo *rel = NULL;		/* keep compiler quiet */
+	RelOptInfo *rel;
+
+	*istrivial_tlist = true;	/* for now */
 
 	/* Guard against stack overflow due to overly complex setop nests */
 	check_stack_depth();
@@ -224,8 +263,6 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		Query	   *subquery = rte->subquery;
 		PlannerInfo *subroot;
 		RelOptInfo *final_rel;
-		Path	   *subpath;
-		Path	   *path;
 		List	   *tlist;
 		bool		trivial_tlist;
 
@@ -262,61 +299,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		/* Return the fully-fledged tlist to caller, too */
 		*pTargetList = tlist;
-
-		/*
-		 * Mark rel with estimated output rows, width, etc.  Note that we have
-		 * to do this before generating outer-query paths, else
-		 * cost_subqueryscan is not happy.
-		 */
-		set_subquery_size_estimates(root, rel);
-
-		/*
-		 * Since we may want to add a partial path to this relation, we must
-		 * set its consider_parallel flag correctly.
-		 */
+		*istrivial_tlist = trivial_tlist;
 		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
-		rel->consider_parallel = final_rel->consider_parallel;
-
-		/*
-		 * For the moment, we consider only a single Path for the subquery.
-		 * This should change soon (make it look more like
-		 * set_subquery_pathlist).
-		 */
-		subpath = get_cheapest_fractional_path(final_rel,
-											   root->tuple_fraction);
-
-		/*
-		 * Stick a SubqueryScanPath atop that.
-		 *
-		 * We don't bother to determine the subquery's output ordering since
-		 * it won't be reflected in the set-op result anyhow; so just label
-		 * the SubqueryScanPath with nil pathkeys.  (XXX that should change
-		 * soon too, likely.)
-		 */
-		path = (Path *) create_subqueryscan_path(root, rel, subpath,
-												 trivial_tlist,
-												 NIL, NULL);
-
-		add_path(rel, path);
-
-		/*
-		 * If we have a partial path for the child relation, we can use that
-		 * to build a partial path for this relation.  But there's no point in
-		 * considering any path but the cheapest.
-		 */
-		if (rel->consider_parallel && bms_is_empty(rel->lateral_relids) &&
-			final_rel->partial_pathlist != NIL)
-		{
-			Path	   *partial_subpath;
-			Path	   *partial_path;
-
-			partial_subpath = linitial(final_rel->partial_pathlist);
-			partial_path = (Path *)
-				create_subqueryscan_path(root, rel, partial_subpath,
-										 trivial_tlist,
-										 NIL, NULL);
-			add_partial_path(rel, partial_path);
-		}
 
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
@@ -341,11 +325,11 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			if (subquery->groupClause || subquery->groupingSets ||
 				subquery->distinctClause ||
 				subroot->hasHavingQual || subquery->hasAggs)
-				*pNumGroups = subpath->rows;
+				*pNumGroups = final_rel->rows;
 			else
 				*pNumGroups = estimate_num_groups(subroot,
 												  get_tlist_exprs(subroot->parse->targetList, false),
-												  subpath->rows,
+												  final_rel->rows,
 												  NULL,
 												  NULL);
 		}
@@ -394,6 +378,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 												*pTargetList,
 												refnames_tlist,
 												&trivial_tlist);
+			*istrivial_tlist = trivial_tlist;
 			target = create_pathtarget(root, *pTargetList);
 
 			/* Apply projection to each path */
@@ -424,16 +409,16 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 				lfirst(lc) = path;
 			}
 		}
+		postprocess_setop_rel(root, rel);
 	}
 	else
 	{
 		elog(ERROR, "unrecognized node type: %d",
 			 (int) nodeTag(setOp));
 		*pTargetList = NIL;
+		rel = NULL;				/* keep compiler quiet */
 	}
 
-	postprocess_setop_rel(root, rel);
-
 	return rel;
 }
 
@@ -452,7 +437,9 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	Path	   *lpath;
 	Path	   *rpath;
 	List	   *lpath_tlist;
+	bool		lpath_trivial_tlist;
 	List	   *rpath_tlist;
+	bool		rpath_trivial_tlist;
 	List	   *tlist;
 	List	   *groupList;
 	double		dNumGroups;
@@ -472,7 +459,10 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 								  false, -1,
 								  refnames_tlist,
 								  &lpath_tlist,
-								  NULL);
+								  &lpath_trivial_tlist, NULL);
+	if (lrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, lrel, lpath_trivial_tlist, lpath_tlist,
+								NULL);
 	lpath = lrel->cheapest_total_path;
 	/* The right path will want to look at the left one ... */
 	root->non_recursive_path = lpath;
@@ -481,7 +471,11 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 								  false, -1,
 								  refnames_tlist,
 								  &rpath_tlist,
+								  &rpath_trivial_tlist,
 								  NULL);
+	if (rrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, rrel, rpath_trivial_tlist, rpath_tlist,
+								NULL);
 	rpath = rrel->cheapest_total_path;
 	root->non_recursive_path = NULL;
 
@@ -543,6 +537,103 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	return result_rel;
 }
 
+/*
+ * build_setop_child_paths
+ *		Build paths for the set op child relation denoted by 'rel'.
+ *
+ * If 'interesting_pathkeys' is non-NIL then also include paths that suit
+ * these pathkeys if such paths are present in the rel's subroot
+ * UPPERREL_FINAL relation.
+ */
+static void
+build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
+						bool trivial_tlist, List *child_tlist,
+						List *interesting_pathkeys)
+{
+	RelOptInfo *final_rel;
+	ListCell   *lc;
+
+	/* it can't be a set op child rel if it's not a subquery */
+	Assert(rel->rtekind == RTE_SUBQUERY);
+
+	/*
+	 * When sorting, add child rel equivalences so that they're available to
+	 * the MergeAppend.
+	 */
+	if (interesting_pathkeys)
+		add_setop_child_rel_equivalences(root,
+										 rel->relids,
+										 child_tlist,
+										 interesting_pathkeys);
+
+	/*
+	 * Mark rel with estimated output rows, width, etc.  Note that we have to
+	 * do this before generating outer-query paths, else cost_subqueryscan is
+	 * not happy.
+	 */
+	set_subquery_size_estimates(root, rel);
+
+	/*
+	 * Since we may want to add a partial path to this relation, we must set
+	 * its consider_parallel flag correctly.
+	 */
+	final_rel = fetch_upper_rel(rel->subroot, UPPERREL_FINAL, NULL);
+	rel->consider_parallel = final_rel->consider_parallel;
+
+	/*
+	 * For each Path that subquery_planner produced, make a SubqueryScanPath
+	 * in the outer query.
+	 */
+	foreach(lc, final_rel->pathlist)
+	{
+		Path	   *subpath = (Path *) lfirst(lc);
+		List	   *pathkeys;
+
+		/* Convert subpath's pathkeys to outer representation */
+		pathkeys = convert_subquery_pathkeys(root, rel, subpath->pathkeys,
+											 make_tlist_from_pathtarget(subpath->pathtarget));
+
+		/* Generate outer path using this subpath */
+		add_path(rel, (Path *) create_subqueryscan_path(root,
+														rel,
+														subpath,
+														trivial_tlist,
+														pathkeys,
+														NULL));
+	}
+
+	/* if outer rel allows parallelism, do same for partial paths */
+	if (rel->consider_parallel && bms_is_empty(rel->lateral_relids))
+	{
+		/* if consider_parallel is false, there should be no partial paths */
+		Assert(final_rel->consider_parallel ||
+			   final_rel->partial_pathlist == NIL);
+
+		/* generate subquery paths for each partial path */
+		foreach(lc, final_rel->partial_pathlist)
+		{
+			Path	   *subpath = (Path *) lfirst(lc);
+			List	   *pathkeys;
+
+			/* convert subpath's pathkeys to outer representation */
+			pathkeys = convert_subquery_pathkeys(root, rel,
+												 subpath->pathkeys,
+												 make_tlist_from_pathtarget(subpath->pathtarget));
+
+			/* generate outer path using this subpath */
+			add_partial_path(rel,
+							 (Path *) create_subqueryscan_path(root,
+															   rel,
+															   subpath,
+															   trivial_tlist,
+															   pathkeys,
+															   NULL));
+		}
+	}
+
+	postprocess_setop_rel(root, rel);
+}
+
 /*
  * Generate paths for a UNION or UNION ALL node
  */
@@ -553,30 +644,23 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 {
 	Relids		relids = NULL;
 	RelOptInfo *result_rel;
-	double		save_fraction = root->tuple_fraction;
 	ListCell   *lc;
+	ListCell   *lc2;
+	ListCell   *lc3;
 	List	   *pathlist = NIL;
+	List	   *ordered_pathlist = NIL;
 	List	   *partial_pathlist = NIL;
 	bool		partial_paths_valid = true;
 	bool		consider_parallel = true;
 	List	   *rellist;
 	List	   *tlist_list;
+	List	   *trivial_tlist_list;
 	List	   *tlist;
-	Path	   *path;
-
-	/*
-	 * If plain UNION, tell children to fetch all tuples.
-	 *
-	 * Note: in UNION ALL, we pass the top-level tuple_fraction unmodified to
-	 * each arm of the UNION ALL.  One could make a case for reducing the
-	 * tuple fraction for later arms (discounting by the expected size of the
-	 * earlier arms' results) but it seems not worth the trouble. The normal
-	 * case where tuple_fraction isn't already zero is a LIMIT at top level,
-	 * and passing it down as-is is usually enough to get the desired result
-	 * of preferring fast-start plans.
-	 */
-	if (!op->all)
-		root->tuple_fraction = 0.0;
+	List	   *groupList = NIL;
+	Path	   *apath;
+	Path	   *gpath = NULL;
+	bool		try_sorted;
+	List	   *union_pathkeys = NIL;
 
 	/*
 	 * If any of my children are identical UNION nodes (same op, all-flag, and
@@ -584,7 +668,11 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+	rellist = plan_union_children(root,
+								  op,
+								  refnames_tlist,
+								  &tlist_list,
+								  &trivial_tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -595,16 +683,68 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	tlist = generate_append_tlist(op->colTypes, op->colCollations, false,
 								  tlist_list, refnames_tlist);
-
 	*pTargetList = tlist;
 
+	/* For UNIONs (not UNION ALL), try sorting, if sorting is possible */
+	try_sorted = !op->all && grouping_is_sortable(op->groupClauses);
+
+	if (try_sorted)
+	{
+		/* determine the pathkeys for sorting by the whole target list */
+		union_pathkeys = build_setop_pathkeys(root,
+											  op,
+											  bms_make_singleton(0),
+											  tlist);
+		groupList = generate_setop_grouplist(op, tlist);
+	}
+
+	/*
+	 * Now that we've got the append target list, we can build the union child
+	 * paths.
+	 */
+	forthree(lc, rellist, lc2, trivial_tlist_list, lc3, tlist_list)
+	{
+		RelOptInfo *rel = lfirst(lc);
+		bool		trivial_tlist = lfirst_int(lc2);
+		List	   *child_tlist = lfirst_node(List, lc3);
+
+		/* only build paths for the union children */
+		if (rel->rtekind == RTE_SUBQUERY)
+			build_setop_child_paths(root, rel, trivial_tlist, child_tlist,
+									union_pathkeys);
+	}
+
 	/* Build path lists and relid set. */
 	foreach(lc, rellist)
 	{
 		RelOptInfo *rel = lfirst(lc);
+		Path	   *ordered_path;
+
 
 		pathlist = lappend(pathlist, rel->cheapest_total_path);
 
+		if (try_sorted)
+		{
+			ordered_path = get_cheapest_path_for_pathkeys(rel->pathlist,
+														  union_pathkeys,
+														  NULL,
+														  TOTAL_COST,
+														  false);
+
+			if (ordered_path != NULL)
+				ordered_pathlist = lappend(ordered_pathlist, ordered_path);
+			else
+			{
+				/*
+				 * If we can't find a sorted path, just give up trying to
+				 * generate a list of correctly sorted child paths.  This can
+				 * happen when type coercion was added to the targetlist due
+				 * to mismatching types from the union children.
+				 */
+				try_sorted = false;
+			}
+		}
+
 		if (consider_parallel)
 		{
 			if (!rel->consider_parallel)
@@ -626,28 +766,21 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
 	result_rel->reltarget = create_pathtarget(root, tlist);
 	result_rel->consider_parallel = consider_parallel;
+	result_rel->consider_startup = (root->tuple_fraction > 0);
 
 	/*
-	 * Append the child results together.
+	 * Append the child results together using the cheapest paths from each
+	 * union child.
 	 */
-	path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
-									   NIL, NULL, 0, false, -1);
-
-	/*
-	 * For UNION ALL, we just need the Append path.  For UNION, need to add
-	 * node(s) to remove duplicates.
-	 */
-	if (!op->all)
-		path = make_union_unique(op, path, tlist, root);
-
-	add_path(result_rel, path);
+	apath = (Path *) create_append_path(root, result_rel, pathlist, NIL,
+										NIL, NULL, 0, false, -1);
 
 	/*
 	 * Estimate number of groups.  For now we just assume the output is unique
 	 * --- this is certainly true for the UNION case, and we want worst-case
 	 * estimates anyway.
 	 */
-	result_rel->rows = path->rows;
+	result_rel->rows = apath->rows;
 
 	/*
 	 * Now consider doing the same thing using the partial paths plus Append
@@ -655,7 +788,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	if (partial_paths_valid)
 	{
-		Path	   *ppath;
+		Path	   *papath;
 		int			parallel_workers = 0;
 
 		/* Find the highest number of workers requested for any subpath. */
@@ -684,21 +817,136 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		}
 		Assert(parallel_workers > 0);
 
-		ppath = (Path *)
+		papath = (Path *)
 			create_append_path(root, result_rel, NIL, partial_pathlist,
-							   NIL, NULL,
-							   parallel_workers, enable_parallel_append,
-							   -1);
-		ppath = (Path *)
-			create_gather_path(root, result_rel, ppath,
+							   NIL, NULL, parallel_workers,
+							   enable_parallel_append, -1);
+		gpath = (Path *)
+			create_gather_path(root, result_rel, papath,
 							   result_rel->reltarget, NULL, NULL);
-		if (!op->all)
-			ppath = make_union_unique(op, ppath, tlist, root);
-		add_path(result_rel, ppath);
 	}
 
-	/* Undo effects of possibly forcing tuple_fraction to 0 */
-	root->tuple_fraction = save_fraction;
+	if (!op->all)
+	{
+		double		dNumGroups;
+		bool		can_sort = grouping_is_sortable(groupList);
+		bool		can_hash = grouping_is_hashable(groupList);
+
+		/*
+		 * XXX for the moment, take the number of distinct groups as equal to
+		 * the total input size, ie, the worst case.  This is too
+		 * conservative, but it's not clear how to get a decent estimate of
+		 * the true size.  One should note as well the propensity of novices
+		 * to write UNION rather than UNION ALL even when they don't expect
+		 * any duplicates...
+		 */
+		dNumGroups = apath->rows;
+
+		if (can_hash)
+		{
+			Path	   *path;
+
+			/*
+			 * Try a hash aggregate plan on 'apath'.  This is the cheapest
+			 * available path containing each append child.
+			 */
+			path = (Path *) create_agg_path(root,
+											result_rel,
+											apath,
+											create_pathtarget(root, tlist),
+											AGG_HASHED,
+											AGGSPLIT_SIMPLE,
+											groupList,
+											NIL,
+											NULL,
+											dNumGroups);
+			add_path(result_rel, path);
+
+			/* Try hash aggregate on the Gather path, if valid */
+			if (gpath != NULL)
+			{
+				/* Hashed aggregate plan --- no sort needed */
+				path = (Path *) create_agg_path(root,
+												result_rel,
+												gpath,
+												create_pathtarget(root, tlist),
+												AGG_HASHED,
+												AGGSPLIT_SIMPLE,
+												groupList,
+												NIL,
+												NULL,
+												dNumGroups);
+				add_path(result_rel, path);
+			}
+		}
+
+		if (can_sort)
+		{
+			Path	   *path = apath;
+
+			if (groupList != NIL)
+				path = (Path *) create_sort_path(root, result_rel, path,
+												 make_pathkeys_for_sortclauses(root, groupList, tlist),
+												 -1.0);
+
+			path = (Path *) create_upper_unique_path(root,
+													 result_rel,
+													 path,
+													 list_length(path->pathkeys),
+													 dNumGroups);
+
+			add_path(result_rel, path);
+
+			/* Try Sort -> Unique on the Gather path, if set */
+			if (gpath != NULL)
+			{
+				path = gpath;
+
+				path = (Path *) create_sort_path(root, result_rel, path,
+												 make_pathkeys_for_sortclauses(root, groupList, tlist),
+												 -1.0);
+
+				path = (Path *) create_upper_unique_path(root,
+														 result_rel,
+														 path,
+														 list_length(path->pathkeys),
+														 dNumGroups);
+				add_path(result_rel, path);
+			}
+		}
+
+		/*
+		 * Try making a MergeAppend path if we managed to find a path with the
+		 * correct pathkeys for each union child.
+		 */
+		if (try_sorted && groupList != NIL)
+		{
+			Path	   *path;
+
+			path = (Path *) create_merge_append_path(root,
+													 result_rel,
+													 ordered_pathlist,
+													 union_pathkeys,
+													 NULL);
+
+			/* and make the MergeAppend unique */
+			path = (Path *) create_upper_unique_path(root,
+													 result_rel,
+													 path,
+													 list_length(tlist),
+													 path->rows);
+
+			add_path(result_rel, path);
+		}
+	}
+	else
+	{
+		/* UNION ALL */
+		add_path(result_rel, apath);
+
+		if (gpath != NULL)
+			add_path(result_rel, gpath);
+	}
 
 	return result_rel;
 }
@@ -724,6 +972,8 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 			   *tlist,
 			   *groupList,
 			   *pathlist;
+	bool		lpath_trivial_tlist,
+				rpath_trivial_tlist;
 	double		dLeftGroups,
 				dRightGroups,
 				dNumGroups,
@@ -743,14 +993,22 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 								  false, 0,
 								  refnames_tlist,
 								  &lpath_tlist,
+								  &lpath_trivial_tlist,
 								  &dLeftGroups);
+	if (lrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, lrel, lpath_trivial_tlist, lpath_tlist,
+								NULL);
 	lpath = lrel->cheapest_total_path;
 	rrel = recurse_set_operations(op->rarg, root,
 								  op->colTypes, op->colCollations,
 								  false, 1,
 								  refnames_tlist,
 								  &rpath_tlist,
+								  &rpath_trivial_tlist,
 								  &dRightGroups);
+	if (rrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, rrel, rpath_trivial_tlist, rpath_tlist,
+								NULL);
 	rpath = rrel->cheapest_total_path;
 
 	/* Undo effects of forcing tuple_fraction to 0 */
@@ -887,13 +1145,16 @@ static List *
 plan_union_children(PlannerInfo *root,
 					SetOperationStmt *top_union,
 					List *refnames_tlist,
-					List **tlist_list)
+					List **tlist_list,
+					List **istrivial_tlist)
 {
 	List	   *pending_rels = list_make1(top_union);
 	List	   *result = NIL;
 	List	   *child_tlist;
+	bool		trivial_tlist;
 
 	*tlist_list = NIL;
+	*istrivial_tlist = NIL;
 
 	while (pending_rels != NIL)
 	{
@@ -932,75 +1193,15 @@ plan_union_children(PlannerInfo *root,
 														false, -1,
 														refnames_tlist,
 														&child_tlist,
+														&trivial_tlist,
 														NULL));
 		*tlist_list = lappend(*tlist_list, child_tlist);
+		*istrivial_tlist = lappend_int(*istrivial_tlist, trivial_tlist);
 	}
 
 	return result;
 }
 
-/*
- * Add nodes to the given path tree to unique-ify the result of a UNION.
- */
-static Path *
-make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-				  PlannerInfo *root)
-{
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-	List	   *groupList;
-	double		dNumGroups;
-
-	/* Identify the grouping semantics */
-	groupList = generate_setop_grouplist(op, tlist);
-
-	/*
-	 * XXX for the moment, take the number of distinct groups as equal to the
-	 * total input size, ie, the worst case.  This is too conservative, but
-	 * it's not clear how to get a decent estimate of the true size.  One
-	 * should note as well the propensity of novices to write UNION rather
-	 * than UNION ALL even when they don't expect any duplicates...
-	 */
-	dNumGroups = path->rows;
-
-	/* Decide whether to hash or sort */
-	if (choose_hashed_setop(root, groupList, path,
-							dNumGroups, dNumGroups,
-							"UNION"))
-	{
-		/* Hashed aggregate plan --- no sort needed */
-		path = (Path *) create_agg_path(root,
-										result_rel,
-										path,
-										create_pathtarget(root, tlist),
-										AGG_HASHED,
-										AGGSPLIT_SIMPLE,
-										groupList,
-										NIL,
-										NULL,
-										dNumGroups);
-	}
-	else
-	{
-		/* Sort and Unique */
-		if (groupList)
-			path = (Path *)
-				create_sort_path(root,
-								 result_rel,
-								 path,
-								 make_pathkeys_for_sortclauses(root,
-															   groupList,
-															   tlist),
-								 -1.0);
-		path = (Path *) create_upper_unique_path(root,
-												 result_rel,
-												 path,
-												 list_length(path->pathkeys),
-												 dNumGroups);
-	}
-
-	return path;
-}
-
 /*
  * postprocess_setop_rel - perform steps required after adding paths
  */
@@ -1406,7 +1607,7 @@ generate_append_tlist(List *colTypes, List *colCollations,
  * setop.  So what we need to do here is copy that list and install
  * proper sortgrouprefs into it (copying those from the targetlist).
  */
-static List *
+List *
 generate_setop_grouplist(SetOperationStmt *op, List *targetlist)
 {
 	List	   *grouplist = copyObject(op->groupClauses);
@@ -1420,11 +1621,7 @@ generate_setop_grouplist(SetOperationStmt *op, List *targetlist)
 		SortGroupClause *sgc;
 
 		if (tle->resjunk)
-		{
-			/* resjunk columns should not have sortgrouprefs */
-			Assert(tle->ressortgroupref == 0);
 			continue;			/* ignore resjunk columns */
-		}
 
 		/* non-resjunk columns should have sortgroupref = resno */
 		Assert(tle->ressortgroupref == tle->resno);
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 0b1d17b9d3..66cae83ee1 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -587,11 +587,22 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 			parent_rel->pathlist = foreach_delete_current(parent_rel->pathlist,
 														  p1);
 
+#ifdef NOT_USED
+
+			/*
+			 * XXX fixme.  When creating the final upper rel paths for queries
+			 * which have setop_pathkeys set, we'll add the cheapest path as-is
+			 * and also try sorting the cheapest path.  If the costs of each are
+			 * fuzzily the same then we might choose to pfree the cheapest
+			 * path.  That's bad as the sort uses that path.
+			 */
+
 			/*
 			 * Delete the data pointed-to by the deleted cell, if possible
 			 */
 			if (!IsA(old_path, IndexPath))
 				pfree(old_path);
+#endif
 		}
 		else
 		{
@@ -1237,6 +1248,9 @@ create_tidrangescan_path(PlannerInfo *root, RelOptInfo *rel,
  *
  * Note that we must handle subpaths = NIL, representing a dummy access path.
  * Also, there are callers that pass root = NULL.
+ * 'rows', when passed as a non-negative number, will be used to overwrite the
+ * returned path's row estimate.  Otherwise the row estimate is calculated
+ * by totaling the rows from 'subpaths'.
  */
 AppendPath *
 create_append_path(PlannerInfo *root,
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index ed85dc7414..1c59e111b6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -397,6 +397,8 @@ struct PlannerInfo
 	List	   *distinct_pathkeys;
 	/* sortClause pathkeys, if any */
 	List	   *sort_pathkeys;
+	/* set operator pathkeys, if any */
+	List	   *setop_pathkeys;
 
 	/* Canonicalised partition schemes used in the query. */
 	List	   *part_schemes pg_node_attr(read_write_ignore);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9e7408c7ec..114e023c3b 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -173,6 +173,9 @@ extern void add_child_join_rel_equivalences(PlannerInfo *root,
 											AppendRelInfo **appinfos,
 											RelOptInfo *parent_joinrel,
 											RelOptInfo *child_joinrel);
+extern void add_setop_child_rel_equivalences(PlannerInfo *root, Relids relids,
+											 List *tlist,
+											 List *setop_pathkeys);
 extern List *generate_implied_equalities_for_column(PlannerInfo *root,
 													RelOptInfo *rel,
 													ec_matches_callback_type callback,
@@ -217,6 +220,8 @@ extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
 								  ScanDirection scandir);
 extern List *build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel,
 									  ScanDirection scandir, bool *partialkeys);
+extern List *build_setop_pathkeys(PlannerInfo *root, SetOperationStmt *op,
+								  Relids relids, List *tlist);
 extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
 									  Oid opno,
 									  Relids rel, bool create_it);
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index 54fd61c9c3..bc2d716c0c 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -53,6 +53,7 @@ extern void preprocess_aggrefs(PlannerInfo *root, Node *clause);
  * prototypes for prepunion.c
  */
 extern RelOptInfo *plan_set_operations(PlannerInfo *root);
-
+extern bool set_operation_ordered_results_useful(SetOperationStmt *setop);
+extern List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
 
 #endif							/* PREP_H */
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 97bbe53b64..5485bdfb0b 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1396,6 +1396,7 @@ SELECT x FROM test3cs WHERE x ~ 'a';
  abc
 (1 row)
 
+SET enable_hashagg TO off;
 SELECT x FROM test1cs UNION SELECT x FROM test2cs ORDER BY x;
   x  
 -----
@@ -1448,6 +1449,7 @@ SELECT DISTINCT x FROM test3cs ORDER BY x;
  ghi
 (4 rows)
 
+RESET enable_hashagg;
 SELECT count(DISTINCT x) FROM test3cs;
  count 
 -------
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 7fdb685313..5fd54a10b1 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -1472,14 +1472,19 @@ explain (costs off) select * from t union select * from t order by 1,3;
    Sort Key: t.a, t.c
    Presorted Key: t.a
    ->  Unique
-         ->  Sort
+         ->  Merge Append
                Sort Key: t.a, t.b, t.c
-               ->  Gather
+               ->  Gather Merge
                      Workers Planned: 2
-                     ->  Parallel Append
+                     ->  Sort
+                           Sort Key: t.a, t.b, t.c
                            ->  Parallel Seq Scan on t
+               ->  Gather Merge
+                     Workers Planned: 2
+                     ->  Sort
+                           Sort Key: t_1.a, t_1.b, t_1.c
                            ->  Parallel Seq Scan on t t_1
-(11 rows)
+(16 rows)
 
 -- Full sort, not just incremental sort can be pushed below a gather merge path
 -- by generate_useful_gather_paths.
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 64cebe4833..5e8143f7ea 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -370,16 +370,16 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                     QUERY PLAN                                     
-------------------------------------------------------------------------------------
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  HashSetOp Intersect
                ->  Append
-                     ->  Subquery Scan on "*SELECT* 2"
-                           ->  Seq Scan on tenk1
                      ->  Subquery Scan on "*SELECT* 1"
-                           ->  Index Only Scan using tenk1_unique1 on tenk1 tenk1_1
+                           ->  Index Only Scan using tenk1_unique1 on tenk1
+                     ->  Subquery Scan on "*SELECT* 2"
+                           ->  Seq Scan on tenk1 tenk1_1
 (8 rows)
 
 select count(*) from
@@ -412,16 +412,17 @@ set enable_hashagg to off;
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 union select fivethous from tenk1 ) ss;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                           QUERY PLAN                           
+----------------------------------------------------------------
  Aggregate
    ->  Unique
-         ->  Sort
+         ->  Merge Append
                Sort Key: tenk1.unique1
-               ->  Append
-                     ->  Index Only Scan using tenk1_unique1 on tenk1
+               ->  Index Only Scan using tenk1_unique1 on tenk1
+               ->  Sort
+                     Sort Key: tenk1_1.fivethous
                      ->  Seq Scan on tenk1 tenk1_1
-(7 rows)
+(8 rows)
 
 select count(*) from
   ( select unique1 from tenk1 union select fivethous from tenk1 ) ss;
@@ -433,18 +434,18 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                        QUERY PLAN                                        
-------------------------------------------------------------------------------------------
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  SetOp Intersect
                ->  Sort
-                     Sort Key: "*SELECT* 2".fivethous
+                     Sort Key: "*SELECT* 1".unique1
                      ->  Append
-                           ->  Subquery Scan on "*SELECT* 2"
-                                 ->  Seq Scan on tenk1
                            ->  Subquery Scan on "*SELECT* 1"
-                                 ->  Index Only Scan using tenk1_unique1 on tenk1 tenk1_1
+                                 ->  Index Only Scan using tenk1_unique1 on tenk1
+                           ->  Subquery Scan on "*SELECT* 2"
+                                 ->  Seq Scan on tenk1 tenk1_1
 (10 rows)
 
 select count(*) from
@@ -950,16 +951,8 @@ select except select;
 -- check hashed implementation
 set enable_hashagg = true;
 set enable_sort = false;
-explain (costs off)
-select from generate_series(1,5) union select from generate_series(1,3);
-                           QUERY PLAN                           
-----------------------------------------------------------------
- HashAggregate
-   ->  Append
-         ->  Function Scan on generate_series
-         ->  Function Scan on generate_series generate_series_1
-(4 rows)
-
+-- We've no way to test hashed UNION here as the Append's empty pathkeys
+-- already allow a Unique, which is cheaper than HashAggregate.
 explain (costs off)
 select from generate_series(1,5) intersect select from generate_series(1,3);
                               QUERY PLAN                              
@@ -972,10 +965,6 @@ select from generate_series(1,5) intersect select from generate_series(1,3);
                ->  Function Scan on generate_series generate_series_1
 (6 rows)
 
-select from generate_series(1,5) union select from generate_series(1,3);
---
-(1 row)
-
 select from generate_series(1,5) union all select from generate_series(1,3);
 --
 (8 rows)
@@ -1102,16 +1091,17 @@ explain (costs off)
   UNION
   SELECT * FROM t2) t
  WHERE ab = 'ab';
-                    QUERY PLAN                     
----------------------------------------------------
- HashAggregate
-   Group Key: ((t1.a || t1.b))
-   ->  Append
-         ->  Index Scan using t1_ab_idx on t1
-               Index Cond: ((a || b) = 'ab'::text)
-         ->  Index Only Scan using t2_pkey on t2
-               Index Cond: (ab = 'ab'::text)
-(7 rows)
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Unique
+   ->  Sort
+         Sort Key: ((t1.a || t1.b))
+         ->  Append
+               ->  Index Scan using t1_ab_idx on t1
+                     Index Cond: ((a || b) = 'ab'::text)
+               ->  Index Only Scan using t2_pkey on t2
+                     Index Cond: (ab = 'ab'::text)
+(8 rows)
 
 --
 -- Test that ORDER BY for UNION ALL can be pushed down to inheritance
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 3db9e25913..96d5d6b11c 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -555,6 +555,7 @@ SELECT x FROM test3cs WHERE x LIKE 'a%';
 SELECT x FROM test3cs WHERE x ILIKE 'a%';
 SELECT x FROM test3cs WHERE x SIMILAR TO 'a%';
 SELECT x FROM test3cs WHERE x ~ 'a';
+SET enable_hashagg TO off;
 SELECT x FROM test1cs UNION SELECT x FROM test2cs ORDER BY x;
 SELECT x FROM test2cs UNION SELECT x FROM test1cs ORDER BY x;
 SELECT x FROM test1cs INTERSECT SELECT x FROM test2cs;
@@ -562,6 +563,7 @@ SELECT x FROM test2cs INTERSECT SELECT x FROM test1cs;
 SELECT x FROM test1cs EXCEPT SELECT x FROM test2cs;
 SELECT x FROM test2cs EXCEPT SELECT x FROM test1cs;
 SELECT DISTINCT x FROM test3cs ORDER BY x;
+RESET enable_hashagg;
 SELECT count(DISTINCT x) FROM test3cs;
 SELECT x, count(*) FROM test3cs GROUP BY x ORDER BY x;
 SELECT x, row_number() OVER (ORDER BY x), rank() OVER (ORDER BY x) FROM test3cs ORDER BY x;
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 599013e7c9..8d1a45c174 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -302,12 +302,11 @@ select except select;
 set enable_hashagg = true;
 set enable_sort = false;
 
-explain (costs off)
-select from generate_series(1,5) union select from generate_series(1,3);
+-- We've no way to test hashed UNION here as the Append's empty pathkeys
+-- already allow a Unique, which is cheaper than HashAggregate.
 explain (costs off)
 select from generate_series(1,5) intersect select from generate_series(1,3);
 
-select from generate_series(1,5) union select from generate_series(1,3);
 select from generate_series(1,5) union all select from generate_series(1,3);
 select from generate_series(1,5) intersect select from generate_series(1,3);
 select from generate_series(1,5) intersect all select from generate_series(1,3);
#3Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: David Rowley (#2)
Re: Properly pathify the union planner

On Fri, Nov 24, 2023 at 3:59 AM David Rowley <dgrowleyml@gmail.com> wrote:

Another problem I hit was add_path() pfreeing a Path that I needed.
This was happening due to how I'm building the final paths in the
subquery when setop_pathkeys are set. Because I want to always
include the cheapest_input_path to allow that path to be used in
hash-based UNIONs, I also want to provide sorted paths so that
MergeAppend has something to work with. I found cases where I'd
add_path() the cheapest_input_path to the final rel then also attempt
to sort that path. Unfortunately, add_path() found the unsorted path
and the sorted path fuzzily the same cost and opted to keep the sorted
one due to it having better pathkeys. add_path() then pfree'd the
cheapest_input_path which meant the Sort's subpath was gone which
obviously caused issues in createplan.c.

For now, as a temporary fix, I've just #ifdef'd out the code in
add_path() that's pfreeing the old path. I have drafted a patch that
refcounts Paths, but I'm unsure if that's the correct solution as I'm
only maintaining the refcounts in add_path() and add_partial_path(). I
think a true correct solution would bump the refcount when a path is
used as some other path's subpath. That would mean having to
recursively pfree paths up until we find one with a refcount>0. Seems
a bit expensive for add_path() to do.

Please find my proposal to refcount paths at [1]. I did that to reduce
the memory consumed by partitionwise joins. I remember another thread
where freeing a path that was referenced by upper sort path created
minor debugging problem. [2]. I paused my work on my proposal since
there didn't seem enough justification. But it looks like the
requirement is coming up repeatedly. I am willing to resume my work if
it's needed. The email lists next TODOs. As to making the add_path()
expensive, I didn't find any noticeable impact on planning time.

[1]: /messages/by-id/CAExHW5tUcVsBkq9qT=L5vYz4e-cwQNw=KAGJrtSyzOp3F=XacA@mail.gmail.com
[2]: /messages/by-id/CAM2+6=UC1mcVtM0Y_LEMBEGHTM58HEkqHPn7vau_V_YfuZjEGg@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

#4David Rowley
dgrowleyml@gmail.com
In reply to: Ashutosh Bapat (#3)
Re: Properly pathify the union planner

On Fri, 24 Nov 2023 at 18:43, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

On Fri, Nov 24, 2023 at 3:59 AM David Rowley <dgrowleyml@gmail.com> wrote:

For now, as a temporary fix, I've just #ifdef'd out the code in
add_path() that's pfreeing the old path. I have drafted a patch that
refcounts Paths, but I'm unsure if that's the correct solution as I'm
only maintaining the refcounts in add_path() and add_partial_path(). I
think a true correct solution would bump the refcount when a path is
used as some other path's subpath. That would mean having to
recursively pfree paths up until we find one with a refcount>0. Seems
a bit expensive for add_path() to do.

Please find my proposal to refcount paths at [1]. I did that to reduce
the memory consumed by partitionwise joins. I remember another thread
where freeing a path that was referenced by upper sort path created
minor debugging problem. [2]. I paused my work on my proposal since
there didn't seem enough justification. But it looks like the
requirement is coming up repeatedly. I am willing to resume my work if
it's needed. The email lists next TODOs. As to making the add_path()
expensive, I didn't find any noticeable impact on planning time.

I missed that thread. Thanks for pointing it out.

I skim-read your patch and I see it does seem to have the workings for
tracking refcounts when the path is a subpath of another path. I
imagine that would allow the undocumented hack that is "if
(!IsA(old_path, IndexPath))" in add_path() to disappear.

I wondered if the problem of pfreeing paths that are in the pathlist
of another relation could be fixed in another way. Suppose we had an
AdoptedPath path type that just inherits the costs from its single
subpath, and we wrapped a Path up in one of these before doing
add_path() on any Path which is not parented by the relation we're
adding the path to. Since we don't recursively pfree() Paths in
add_path(), we'd then only ever pfree the AdoptedPath rather than
pfreeing a Path that directly exists in another relation's pathlist.
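
For illustration, here is a minimal sketch of what such a wrapper might
look like (this node type doesn't exist anywhere yet; the name and
layout are purely hypothetical):

typedef struct AdoptedPath
{
	Path		path;		/* parent is the rel we're adding to; costs
							 * are copied from the subpath */
	Path	   *subpath;	/* path owned by some other RelOptInfo */
} AdoptedPath;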

Another, simpler, option would be to just not pfree the Path if the
Path's parent is not the rel that add_path() is adding to.
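
In sketch form (untested; this assumes Path->parent still points at the
rel that owns the path), that would just mean changing the pfree logic
in add_path() to something like:

			/*
			 * Only free paths owned by this rel; a path belonging to some
			 * other rel may still be referenced from that rel's pathlist
			 * or as a subpath elsewhere.
			 */
			if (old_path->parent == parent_rel &&
				!IsA(old_path, IndexPath))
				pfree(old_path);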

David


[1] /messages/by-id/CAExHW5tUcVsBkq9qT=L5vYz4e-cwQNw=KAGJrtSyzOp3F=XacA@mail.gmail.com
[2] /messages/by-id/CAM2+6=UC1mcVtM0Y_LEMBEGHTM58HEkqHPn7vau_V_YfuZjEGg@mail.gmail.com

#5Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#2)
1 attachment(s)
Re: Properly pathify the union planner

On Fri, Nov 24, 2023 at 6:29 AM David Rowley <dgrowleyml@gmail.com> wrote:

I've attached the updated patch. This one is probably ready for
someone to test out. There will be more work to do, however.

I just started reviewing this patch and haven't looked through all the
details yet. Here are some feedbacks that came to my mind. Post them
first so that I don’t forget them after the holidays.

* I think we should update truncate_useless_pathkeys() to account for
the ordering requested by the query's set operation; otherwise we may
not get a subquery's path with the expected pathkeys. For instance,

create table t (a int, b int);
create index on t (a, b);
set enable_hashagg to off;

-- on v1 patch
explain (costs off)
(select * from t order by a) UNION (select * from t order by a);
                           QUERY PLAN
------------------------------------------------------------
 Unique
   ->  Merge Append
         Sort Key: t.a, t.b
         ->  Incremental Sort
               Sort Key: t.a, t.b
               Presorted Key: t.a
               ->  Index Only Scan using t_a_b_idx on t
         ->  Incremental Sort
               Sort Key: t_1.a, t_1.b
               Presorted Key: t_1.a
               ->  Index Only Scan using t_a_b_idx on t t_1
(11 rows)

-- after accounting for setop_pathkeys in truncate_useless_pathkeys()
explain (costs off)
(select * from t order by a) UNION (select * from t order by a);
                      QUERY PLAN
------------------------------------------------------
 Unique
   ->  Merge Append
         Sort Key: t.a, t.b
         ->  Index Only Scan using t_a_b_idx on t
         ->  Index Only Scan using t_a_b_idx on t t_1
(5 rows)

* I understand that we need to sort (full or incremental) the paths of
the subqueries to meet the ordering required for setop_pathkeys, so that
MergeAppend has something to work with. Currently in the v1 patch this
sorting is performed during the planning phase of the subqueries (in
grouping_planner).

And we want to add the subquery's cheapest_total_path as-is to allow
that path to be used in hash-based UNIONs, and we also want to add a
sorted path on top of cheapest_total_path. And then we may encounter
the issue you mentioned earlier regarding add_path() potentially freeing
the cheapest_total_path, leaving the Sort's subpath gone.

I'm thinking that maybe it'd be better to move the work of sorting the
subquery's paths to the outer query level, specifically within the
build_setop_child_paths() function, just before we stick SubqueryScanPath
on top of the subquery's paths. I think this is better because:

1. This minimizes the impact on subquery planning and reduces the
footprint within the grouping_planner() function as much as possible.

2. This can help avoid the aforementioned add_path() issue because the
two involved paths will be structured as:

cheapest_path -> subqueryscan
and
cheapest_path -> sort -> subqueryscan

If the two paths cost fuzzily the same and add_path() decides to keep
the second one due to it having better pathkeys and pfree the first one,
it would not be a problem.

BTW, I haven't looked through the part involving partial paths, but I
think we can do the same for them.

* I noticed that in generate_union_paths() we use a new function
build_setop_pathkeys() to build the 'union_pathkeys'. I wonder why we
don't simply utilize make_pathkeys_for_sortclauses() since we already
have the grouplist for the setop's output columns.

To assist the discussion I've attached a diff file that includes all the
changes above.

Thanks
Richard

Attachments:

v1-0001-review-diff.patchapplication/octet-stream; name=v1-0001-review-diff.patchDownload
From f2fb47a38003e76aa06b5c6dd2afbeaedef36561 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Tue, 6 Feb 2024 11:11:21 +0800
Subject: [PATCH v1] review diff

---
 src/backend/optimizer/path/pathkeys.c  | 96 +++++++++-----------------
 src/backend/optimizer/plan/planner.c   | 84 +---------------------
 src/backend/optimizer/prep/prepunion.c | 77 +++++++++++++++++++--
 src/backend/optimizer/util/pathnode.c  | 11 ---
 src/include/optimizer/paths.h          |  2 -
 5 files changed, 104 insertions(+), 166 deletions(-)

diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 2eb144f1df..e0281ef84c 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -985,71 +985,6 @@ build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel,
 	return retval;
 }
 
-/*
- * build_setop_pathkeys
- *	  Build and return a list of PathKeys, one for each non-junk target in
- *	  'tlist'.
- */
-List *
-build_setop_pathkeys(PlannerInfo *root, SetOperationStmt *op,
-					 Relids relids, List *tlist)
-{
-	ListCell   *lc;
-	ListCell   *sgccell = list_head(op->groupClauses);
-	List	   *retval = NIL;
-
-	foreach(lc, tlist)
-	{
-		TargetEntry *tle = lfirst_node(TargetEntry, lc);
-		SortGroupClause *sgc;
-
-		Oid			opfamily;
-		Oid			opcintype;
-		int16		strategy;
-		PathKey    *cpathkey;
-
-		if (tle->resjunk)
-			continue;
-
-		/*
-		 * XXX query_is_distinct_for() is happy to Assert this, should this do
-		 * that rather than ERROR?
-		 */
-		if (sgccell == NULL)
-			elog(ERROR, "too few group clauses");
-
-		sgc = lfirst_node(SortGroupClause, sgccell);
-
-		/* Find the operator in pg_amop --- failure shouldn't happen */
-		if (!get_ordering_op_properties(sgc->sortop,
-										&opfamily, &opcintype, &strategy))
-			elog(ERROR, "operator %u is not a valid ordering operator",
-				 sgc->eqop);
-
-		cpathkey = make_pathkey_from_sortinfo(root,
-											  tle->expr,
-											  opfamily,
-											  opcintype,
-											  exprCollation((Node *) tle->expr),
-											  false,
-											  sgc->nulls_first,
-											  0,
-											  relids,
-											  true);
-		retval = lappend(retval, cpathkey);
-		sgccell = lnext(op->groupClauses, sgccell);
-
-		/*
-		 * There's no need to look for redundant pathkeys as set operations
-		 * have no ability to have non-child constants in an EquivalenceClass.
-		 * Let's just make sure that remains true.
-		 */
-		Assert(!EC_MUST_BE_REDUNDANT(cpathkey->pk_eclass));
-	}
-
-	return retval;
-}
-
 /*
  * build_expression_pathkey
  *	  Build a pathkeys list that describes an ordering by a single expression
@@ -2251,6 +2186,34 @@ pathkeys_useful_for_grouping(PlannerInfo *root, List *pathkeys)
 	return n;
 }
 
+/*
+ * pathkeys_useful_for_setop
+ *		Count the number of pathkeys that are useful for meeting the ordering
+ *		requested by the query's set operation.
+ *
+ * Because we have the possibility of incremental sort, a prefix list of
+ * keys is potentially useful for improving the performance of the set
+ * operation's ordering.  Thus we return 0 if no useful keys are found, or
+ * the number of leading keys shared by the list and the set operation's
+ * ordering.
+ */
+static int
+pathkeys_useful_for_setop(PlannerInfo *root, List *pathkeys)
+{
+	int			n_common_pathkeys;
+
+	if (root->setop_pathkeys == NIL)
+		return 0;				/* no special setop ordering requested */
+
+	if (pathkeys == NIL)
+		return 0;				/* unordered path */
+
+	(void) pathkeys_count_contained_in(root->setop_pathkeys, pathkeys,
+									   &n_common_pathkeys);
+
+	return n_common_pathkeys;
+}
+
 /*
  * truncate_useless_pathkeys
  *		Shorten the given pathkey list to just the useful pathkeys.
@@ -2268,6 +2231,9 @@ truncate_useless_pathkeys(PlannerInfo *root,
 	if (nuseful2 > nuseful)
 		nuseful = nuseful2;
 	nuseful2 = pathkeys_useful_for_grouping(root, pathkeys);
+	if (nuseful2 > nuseful)
+		nuseful = nuseful2;
+	nuseful2 = pathkeys_useful_for_setop(root, pathkeys);
 	if (nuseful2 > nuseful)
 		nuseful = nuseful2;
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5cb140c049..47006edeae 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1769,13 +1769,6 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 	foreach(lc, current_rel->pathlist)
 	{
 		Path	   *path = (Path *) lfirst(lc);
-		Path	   *input_path;
-
-		/*
-		 * Keep record of the unmodified path so that we can still tell which
-		 * one is the cheapest_input_path.
-		 */
-		input_path = path;
 
 		/*
 		 * If there is a FOR [KEY] UPDATE/SHARE clause, add the LockRows node.
@@ -1804,82 +1797,7 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		}
 
 		/*
-		 * Create paths to suit final sort order required for setop_pathkeys.
-		 * Here we'll sort the cheapest input path (if not sorted already) and
-		 * incremental sort any paths which are partially sorted.  We also
-		 * include the cheapest path as-is so that the set operation can be
-		 * cheaply implemented using a method which does not require the input
-		 * to be sorted.
-		 */
-		if (root->setop_pathkeys != NIL)
-		{
-			Path	   *cheapest_input_path = current_rel->cheapest_total_path;
-			bool		is_sorted;
-			int			presorted_keys;
-
-			is_sorted = pathkeys_count_contained_in(root->setop_pathkeys,
-													path->pathkeys,
-													&presorted_keys);
-
-			if (!is_sorted)
-			{
-				/*
-				 * Try at least sorting the cheapest path and also try
-				 * incrementally sorting any path which is partially sorted
-				 * already (no need to deal with paths which have presorted
-				 * keys when incremental sort is disabled unless it's the
-				 * cheapest input path).
-				 */
-				if (input_path != cheapest_input_path)
-				{
-					if (presorted_keys == 0 || !enable_incremental_sort)
-						continue;
-				}
-				else
-				{
-					/*
-					 * If this is an INSERT/UPDATE/DELETE/MERGE, add a
-					 * ModifyTable path.
-					 */
-					if (parse->commandType != CMD_SELECT)
-						path = build_final_modify_table_path(root,
-															 final_rel,
-															 path);
-
-					/*
-					 * The union planner might want to try a hash-based method
-					 * of executing the set operation, so let's provide the
-					 * cheapest input path so that it can do that as cheaply
-					 * as possible.  If the cheapest_input_path happens to be
-					 * already correctly sorted then we'll add it to final_rel
-					 * at the end of the loop.
-					 */
-					add_path(final_rel, path);
-				}
-
-				/*
-				 * We've no need to consider both a sort and incremental sort.
-				 * We'll just do a sort if there are no presorted keys and an
-				 * incremental sort when there are presorted keys.
-				 */
-				if (presorted_keys == 0 || !enable_incremental_sort)
-					path = (Path *) create_sort_path(root,
-													 final_rel,
-													 path,
-													 root->setop_pathkeys,
-													 limit_tuples);
-				else
-					path = (Path *) create_incremental_sort_path(root,
-																 final_rel,
-																 path,
-																 root->setop_pathkeys,
-																 presorted_keys,
-																 limit_tuples);
-			}
-		}
-
-		/*
-		 * If this is an INSERT/UPDATE/DELETE/MERGE, add a ModifyTable path.
+		 * If this is an INSERT/UPDATE/DELETE/MERGE, add the ModifyTable node.
 		 */
 		if (parse->commandType != CMD_SELECT)
 			path = build_final_modify_table_path(root, final_rel, path);
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index dfdf2ec0a9..b0938e0110 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -588,6 +588,74 @@ build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
 	{
 		Path	   *subpath = (Path *) lfirst(lc);
 		List	   *pathkeys;
+		Path	   *cheapest_input_path = final_rel->cheapest_total_path;
+		List	   *setop_pathkeys = rel->subroot->setop_pathkeys;
+		bool		is_sorted;
+		int			presorted_keys;
+
+		/*
+		 * Include the cheapest path as-is so that the set operation can be
+		 * cheaply implemented using a method which does not require the input
+		 * to be sorted.
+		 */
+		if (subpath == cheapest_input_path)
+		{
+			/* Convert subpath's pathkeys to outer representation */
+			pathkeys = convert_subquery_pathkeys(root, rel, subpath->pathkeys,
+												 make_tlist_from_pathtarget(subpath->pathtarget));
+
+			/* Generate outer path using this subpath */
+			add_path(rel, (Path *) create_subqueryscan_path(root,
+															rel,
+															subpath,
+															trivial_tlist,
+															pathkeys,
+															NULL));
+		}
+
+		/*
+		 * Create paths to suit final sort order required for setop_pathkeys.
+		 * Here we'll sort the cheapest input path (if not sorted already) and
+		 * incremental sort any paths which are partially sorted.
+		 */
+		is_sorted = pathkeys_count_contained_in(setop_pathkeys,
+												subpath->pathkeys,
+												&presorted_keys);
+
+		if (!is_sorted)
+		{
+			double limittuples = rel->subroot->limit_tuples;
+
+			/*
+			 * Try at least sorting the cheapest path and also try
+			 * incrementally sorting any path which is partially sorted
+			 * already (no need to deal with paths which have presorted
+			 * keys when incremental sort is disabled unless it's the
+			 * cheapest input path).
+			 */
+			if (subpath != cheapest_input_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * We've no need to consider both a sort and incremental sort.
+			 * We'll just do a sort if there are no presorted keys and an
+			 * incremental sort when there are presorted keys.
+			 */
+			if (presorted_keys == 0 || !enable_incremental_sort)
+				subpath = (Path *) create_sort_path(root,
+													final_rel,
+													subpath,
+													setop_pathkeys,
+													limittuples);
+			else
+				subpath = (Path *) create_incremental_sort_path(root,
+																final_rel,
+																subpath,
+																setop_pathkeys,
+																presorted_keys,
+																limittuples);
+		}
 
 		/* Convert subpath's pathkeys to outer representation */
 		pathkeys = convert_subquery_pathkeys(root, rel, subpath->pathkeys,
@@ -690,12 +758,11 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 
 	if (try_sorted)
 	{
-		/* determine the pathkeys for sorting by the whole target list */
-		union_pathkeys = build_setop_pathkeys(root,
-											  op,
-											  bms_make_singleton(0),
-											  tlist);
+		/* Identify the grouping semantics */
 		groupList = generate_setop_grouplist(op, tlist);
+
+		/* Determine the pathkeys for sorting by the whole target list */
+		union_pathkeys = make_pathkeys_for_sortclauses(root, groupList, tlist);
 	}
 
 	/*
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e375cd52b5..64a2cd1045 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -587,22 +587,11 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 			parent_rel->pathlist = foreach_delete_current(parent_rel->pathlist,
 														  p1);
 
-#ifdef NOT_USED
-
-			/*
-			 * XXX fixme.  When creating the final upper rel paths for queries
-			 * which have setop_pathkey set,  we'll at the cheapest path as-is
-			 * also try sorting the cheapest path. If the costs of each are
-			 * fuzzily the same then we might choose to pfree the cheapest
-			 * path.  That's bad as the sort uses that path.
-			 */
-
 			/*
 			 * Delete the data pointed-to by the deleted cell, if possible
 			 */
 			if (!IsA(old_path, IndexPath))
 				pfree(old_path);
-#endif
 		}
 		else
 		{
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 7dc3507e8d..1165e98c5b 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -222,8 +222,6 @@ extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
 								  ScanDirection scandir);
 extern List *build_partition_pathkeys(PlannerInfo *root, RelOptInfo *partrel,
 									  ScanDirection scandir, bool *partialkeys);
-extern List *build_setop_pathkeys(PlannerInfo *root, SetOperationStmt *op,
-								  Relids relids, List *tlist);
 extern List *build_expression_pathkey(PlannerInfo *root, Expr *expr,
 									  Oid opno,
 									  Relids rel, bool create_it);
-- 
2.31.0

#6Andy Fan
zhihuifan1213@163.com
In reply to: Richard Guo (#5)
1 attachment(s)
Re: Properly pathify the union planner

Hi,

* I think we should update truncate_useless_pathkeys() to account for
the ordering requested by the query's set operation;

Nice catch.

I'm thinking that maybe it'd be better to move the work of sorting the
subquery's paths to the outer query level, specifically within the
build_setop_child_paths() function, just before we stick SubqueryScanPath
on top of the subquery's paths. I think this is better because:

1. This minimizes the impact on subquery planning and reduces the
footprint within the grouping_planner() function as much as possible.

2. This can help avoid the aforementioned add_path() issue because the
two involved paths will be structured as:

cheapest_path -> subqueryscan
and
cheapest_path -> sort -> subqueryscan

If the two paths cost fuzzily the same and add_path() decides to keep
the second one due to it having better pathkeys and pfree the first one,
it would not be a problem.

This is a smart idea; it works because you create two different
SubqueryScan paths for the cheapest_input_path.

FWIW, I noticed we don't call create_sort_path() while building a merge
join path; instead, we only cost the sort, add that cost to the
mergejoin path, and note that the path needs presorted input. Only
later, in create_mergejoin_*plan*, is the Sort plan actually created.
Could we use a similar strategy for the mergeappend case? That way we
might simplify the code around the MergeAppend node, since the caller
would just need to say "I want to try a MergeAppend with the given
pathkeys" without having to create the sorts themselves.

(Having had a quick glance at initial_cost_mergejoin and
create_mergejoin_plan, it looks like incremental sort doesn't work with
mergejoin?)
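
To illustrate, this is roughly the shape of that deferral (a simplified
sketch of the merge join code, not verbatim source):

    /*
     * At path time, initial_cost_mergejoin() only costs the sorts and
     * the MergePath merely records the required pathkeys in
     * outersortkeys/innersortkeys; no Sort path node is created.
     *
     * Only at plan creation time does create_mergejoin_plan() build the
     * Sort plan on top of the child plan, roughly:
     */
    if (best_path->outersortkeys)
        outer_plan = (Plan *) make_sort_from_pathkeys(outer_plan,
                                                      best_path->outersortkeys,
                                                      outer_relids);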

To assist the discussion I've attached a diff file that includes all the
changes above.

+ */
+static int
+pathkeys_useful_for_setop(PlannerInfo *root, List *pathkeys)
+{
+	int			n_common_pathkeys;
+
+	if (root->setop_pathkeys == NIL)
+		return 0;				/* no special setop ordering requested */
+
+	if (pathkeys == NIL)
+		return 0;				/* unordered path */
+
+	(void) pathkeys_count_contained_in(root->setop_pathkeys, pathkeys,
+									   &n_common_pathkeys);
+
+	return n_common_pathkeys;
+}

The two if-clauses look unnecessary; those cases are already handled by
pathkeys_count_contained_in. The same issue exists in
pathkeys_useful_for_ordering as well. The attached patch fixes it in
master.

--
Best Regards
Andy Fan

Attachments:

v1-0001-Remove-the-unnecessary-checks-for-pathkey-routine.patchtext/x-diffDownload
From f562dd8b80bd0b7c2d66ee9633434a1e5be82e04 Mon Sep 17 00:00:00 2001
From: "yizhi.fzh" <yizhi.fzh@alibaba-inc.com>
Date: Wed, 7 Feb 2024 07:00:33 +0800
Subject: [PATCH v1 1/1] Remove the unnecessary checks for pathkey routines

Both pathkeys_count_contained_in and foreach can handle NIL input, so
there is no need for the extra checks.
---
 src/backend/optimizer/path/pathkeys.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 82ff31273b..98334e5c52 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -2126,12 +2126,6 @@ pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
 {
 	int			n_common_pathkeys;
 
-	if (root->query_pathkeys == NIL)
-		return 0;				/* no special ordering requested */
-
-	if (pathkeys == NIL)
-		return 0;				/* unordered path */
-
 	(void) pathkeys_count_contained_in(root->query_pathkeys, pathkeys,
 									   &n_common_pathkeys);
 
@@ -2167,10 +2161,6 @@ pathkeys_useful_for_grouping(PlannerInfo *root, List *pathkeys)
 	if (root->group_pathkeys == NIL)
 		return 0;
 
-	/* unordered path */
-	if (pathkeys == NIL)
-		return 0;
-
 	/* walk the pathkeys and search for matching group key */
 	foreach(key, pathkeys)
 	{
-- 
2.34.1

#7David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#5)
1 attachment(s)
Re: Properly pathify the union planner

On Tue, 6 Feb 2024 at 22:05, Richard Guo <guofenglinux@gmail.com> wrote:

I'm thinking that maybe it'd be better to move the work of sorting the
subquery's paths to the outer query level, specifically within the
build_setop_child_paths() function, just before we stick SubqueryScanPath
on top of the subquery's paths. I think this is better because:

1. This minimizes the impact on subquery planning and reduces the
footprint within the grouping_planner() function as much as possible.

2. This can help avoid the aforementioned add_path() issue because the
two involved paths will be structured as:

Yes, this is a good idea. I agree with both of your points.

I've taken your suggested changes with minor fixups and expanded on
them to handle the partial paths too. I've also removed almost all of
the changes to planner.c.

I fixed a bug where I was overwriting the union child's
TargetEntry.ressortgroupref without considering that it might be set
for some other purpose in the subquery. I wrote
generate_setop_child_grouplist() to handle this; it's almost like
generate_setop_grouplist() except that it calls assignSortGroupRef() to
figure out the next free tleSortGroupRef (or reuses the existing one if
the TargetEntry already has one set).

Earlier, I pushed a small comment change to pathnode.c in order to
shrink this patch down a little. It was also a change that could be
made in isolation from this work.

v2 attached.

David

Attachments:

better_union_planner_v2.patchtext/plain; charset=US-ASCII; name=better_union_planner_v2.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index b5a38aeb21..62b7103d0c 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -11454,6 +11454,10 @@ DROP INDEX base_tbl1_idx;
 DROP INDEX base_tbl2_idx;
 DROP INDEX async_p3_idx;
 -- UNION queries
+SET enable_sort TO off;
+SET enable_incremental_sort TO off;
+-- Adjust fdw_startup_cost so that we get an unordered path in the Append.
+ALTER SERVER loopback2 OPTIONS (ADD fdw_startup_cost '0.00');
 EXPLAIN (VERBOSE, COSTS OFF)
 INSERT INTO result_tbl
 (SELECT a, b, 'AAA' || c FROM async_p1 ORDER BY a LIMIT 10)
@@ -11535,6 +11539,9 @@ SELECT * FROM result_tbl ORDER BY a;
 (12 rows)
 
 DELETE FROM result_tbl;
+RESET enable_incremental_sort;
+RESET enable_sort;
+ALTER SERVER loopback2 OPTIONS (DROP fdw_startup_cost);
 -- Disable async execution if we use gating Result nodes for pseudoconstant
 -- quals
 EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index f410c3db4e..0ed9c5d61f 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3857,6 +3857,11 @@ DROP INDEX base_tbl2_idx;
 DROP INDEX async_p3_idx;
 
 -- UNION queries
+SET enable_sort TO off;
+SET enable_incremental_sort TO off;
+-- Adjust fdw_startup_cost so that we get an unordered path in the Append.
+ALTER SERVER loopback2 OPTIONS (ADD fdw_startup_cost '0.00');
+
 EXPLAIN (VERBOSE, COSTS OFF)
 INSERT INTO result_tbl
 (SELECT a, b, 'AAA' || c FROM async_p1 ORDER BY a LIMIT 10)
@@ -3883,6 +3888,10 @@ UNION ALL
 SELECT * FROM result_tbl ORDER BY a;
 DELETE FROM result_tbl;
 
+RESET enable_incremental_sort;
+RESET enable_sort;
+ALTER SERVER loopback2 OPTIONS (DROP fdw_startup_cost);
+
 -- Disable async execution if we use gating Result nodes for pseudoconstant
 -- quals
 EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 4bd60a09c6..c036911309 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2867,6 +2867,58 @@ add_child_join_rel_equivalences(PlannerInfo *root,
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * add_setop_child_rel_equivalences
+ *		Add equivalence members for each non-resjunk target in 'child_tlist'
+ *		to the EquivalenceClass in the corresponding setop_pathkey's pk_class.
+ *
+ * 'root' is the PlannerInfo belonging to the top-level set operation.
+ * 'child_relids' are the Relids of the child relation we're adding
+ * EquivalenceMembers for.
+ * 'child_tlist' is the target list for the setop child relation.  The target
+ * list expressions are what we add as EquivalenceMembers.
+ * 'setop_pathkeys' is a list of PathKeys which must contain an entry for each
+ * non-resjunk target in 'child_tlist'.
+ */
+void
+add_setop_child_rel_equivalences(PlannerInfo *root, Relids child_relids,
+								 List *child_tlist, List *setop_pathkeys)
+{
+	ListCell   *lc;
+	ListCell   *lc2 = list_head(setop_pathkeys);
+
+	foreach(lc, child_tlist)
+	{
+		TargetEntry *tle = lfirst_node(TargetEntry, lc);
+		EquivalenceMember *parent_em;
+		PathKey    *pk;
+
+		if (tle->resjunk)
+			continue;
+
+		if (lc2 == NULL)
+			elog(ERROR, "too few pathkeys for set operation");
+
+		pk = lfirst_node(PathKey, lc2);
+		parent_em = linitial(pk->pk_eclass->ec_members);
+
+		/*
+		 * We can safely pass the parent member as the first member in the
+		 * ec_members list as this is added first in generate_union_paths,
+		 * likewise, the JoinDomain can be that of the initial member of the
+		 * Pathkey's EquivalenceClass.
+		 */
+		add_eq_member(pk->pk_eclass,
+					  tle->expr,
+					  child_relids,
+					  parent_em->em_jdomain,
+					  parent_em,
+					  exprType((Node *) tle->expr));
+
+		lc2 = lnext(setop_pathkeys, lc2);
+	}
+}
+
 
 /*
  * generate_implied_equalities_for_column
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index bcc1b7a9eb..13f2eb5bed 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -2203,6 +2203,22 @@ pathkeys_useful_for_grouping(PlannerInfo *root, List *pathkeys)
 	return n;
 }
 
+/*
+ * pathkeys_useful_for_setop
+ *		Count the number of leading common pathkeys root's 'setop_pathkeys' in
+ *		'pathkeys'.
+ */
+static int
+pathkeys_useful_for_setop(PlannerInfo *root, List *pathkeys)
+{
+	int			n_common_pathkeys;
+
+	(void) pathkeys_count_contained_in(root->setop_pathkeys, pathkeys,
+									   &n_common_pathkeys);
+
+	return n_common_pathkeys;
+}
+
 /*
  * truncate_useless_pathkeys
  *		Shorten the given pathkey list to just the useful pathkeys.
@@ -2220,6 +2236,9 @@ truncate_useless_pathkeys(PlannerInfo *root,
 	if (nuseful2 > nuseful)
 		nuseful = nuseful2;
 	nuseful2 = pathkeys_useful_for_grouping(root, pathkeys);
+	if (nuseful2 > nuseful)
+		nuseful = nuseful2;
+	nuseful2 = pathkeys_useful_for_setop(root, pathkeys);
 	if (nuseful2 > nuseful)
 		nuseful = nuseful2;
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index be4e182869..33627dd4ea 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -58,6 +58,7 @@
 #include "optimizer/tlist.h"
 #include "parser/analyze.h"
 #include "parser/parse_agg.h"
+#include "parser/parse_clause.h"
 #include "parser/parse_relation.h"
 #include "parser/parsetree.h"
 #include "partitioning/partdesc.h"
@@ -126,6 +127,8 @@ typedef struct
 {
 	List	   *activeWindows;	/* active windows, if any */
 	grouping_sets_data *gset_data;	/* grouping sets data, if any */
+	SetOperationStmt *setop;	/* parent set operation or NULL if not a
+								 * subquery belonging to a set operation */
 } standard_qp_extra;
 
 /* Local functions */
@@ -256,6 +259,8 @@ static bool group_by_has_partkey(RelOptInfo *input_rel,
 								 List *targetList,
 								 List *groupClause);
 static int	common_prefix_cmp(const void *a, const void *b);
+static List *generate_setop_child_grouplist(SetOperationStmt *op,
+											List *targetlist);
 
 
 /*****************************************************************************
@@ -1507,6 +1512,18 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		qp_extra.activeWindows = activeWindows;
 		qp_extra.gset_data = gset_data;
 
+		/*
+		 * Check if we're a subquery for a set operation.  If we are, store
+		 * the SetOperationStmt in qp_extra.
+		 */
+		if (root->parent_root != NULL &&
+			root->parent_root->parse->setOperations != NULL &&
+			IsA(root->parent_root->parse->setOperations, SetOperationStmt))
+			qp_extra.setop =
+				(SetOperationStmt *) root->parent_root->parse->setOperations;
+		else
+			qp_extra.setop = NULL;
+
 		/*
 		 * Generate the best unsorted and presorted paths for the scan/join
 		 * portion of this Query, ie the processing represented by the
@@ -3440,6 +3457,27 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 									  parse->sortClause,
 									  tlist);
 
+	/* setting setop_pathkeys might be useful to the union planner */
+	if (qp_extra->setop != NULL &&
+		set_operation_ordered_results_useful(qp_extra->setop))
+	{
+		List	   *groupClauses;
+		bool		sortable;
+
+		groupClauses = generate_setop_child_grouplist(qp_extra->setop, tlist);
+
+		root->setop_pathkeys =
+			make_pathkeys_for_sortclauses_extended(root,
+												   &groupClauses,
+												   tlist,
+												   false,
+												   &sortable);
+		if (!sortable)
+			root->setop_pathkeys = NIL;
+	}
+	else
+		root->setop_pathkeys = NIL;
+
 	/*
 	 * Figure out whether we want a sorted result from query_planner.
 	 *
@@ -3449,7 +3487,9 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 	 * sortable DISTINCT clause that's more rigorous than the ORDER BY clause,
 	 * we try to produce output that's sufficiently well sorted for the
 	 * DISTINCT.  Otherwise, if there is an ORDER BY clause, we want to sort
-	 * by the ORDER BY clause.
+	 * by the ORDER BY clause.  Otherwise, if we're a subquery being planned
+	 * for a set operation which can benefit from presorted results and have a
+	 * sortable targetlist, we want to sort by the target list.
 	 *
 	 * Note: if we have both ORDER BY and GROUP BY, and ORDER BY is a superset
 	 * of GROUP BY, it would be tempting to request sort by ORDER BY --- but
@@ -3467,6 +3507,8 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 		root->query_pathkeys = root->distinct_pathkeys;
 	else if (root->sort_pathkeys)
 		root->query_pathkeys = root->sort_pathkeys;
+	else if (root->setop_pathkeys != NIL)
+		root->query_pathkeys = root->setop_pathkeys;
 	else
 		root->query_pathkeys = NIL;
 }
@@ -7862,3 +7904,47 @@ group_by_has_partkey(RelOptInfo *input_rel,
 
 	return true;
 }
+
+/*
+ * generate_setop_child_grouplist
+ *		Build a SortGroupClause list defining the sort/grouping properties
+ *		of the child of a set operation.
+ *
+ * This is similar to generate_setop_grouplist() but differs as the setop
+ * child query's targetlist entries may already have a tleSortGroupRef
+ * assigned for other purposes, such as GROUP BYs.  Here we keep the
+ * SortGroupClause list in the same order as 'op' groupClauses and just adjust
+ * the tleSortGroupRef to reference the TargetEntry's 'ressortgroupref'.
+ */
+static List *
+generate_setop_child_grouplist(SetOperationStmt *op, List *targetlist)
+{
+	List	   *grouplist = copyObject(op->groupClauses);
+	ListCell   *lg;
+	ListCell   *lt;
+
+	lg = list_head(grouplist);
+	foreach(lt, targetlist)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(lt);
+		SortGroupClause *sgc;
+
+		if (tle->resjunk)
+		{
+			/* resjunk columns should not have sortgrouprefs */
+			Assert(tle->ressortgroupref == 0);
+			continue;			/* ignore resjunk columns */
+		}
+
+		/* non-resjunk columns should have grouping clauses */
+		Assert(lg != NULL);
+		sgc = (SortGroupClause *) lfirst(lg);
+		lg = lnext(grouplist, lg);
+
+		/* assign a tleSortGroupRef, or reuse the existing one */
+		sgc->tleSortGroupRef = assignSortGroupRef(tle, targetlist);
+		tle->ressortgroupref = sgc->tleSortGroupRef;
+	}
+	Assert(lg == NULL);
+	return grouplist;
+}
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index c939b5a45f..7af99ca71c 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -51,11 +51,15 @@ static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
 										  bool junkOK,
 										  int flag, List *refnames_tlist,
 										  List **pTargetList,
+										  bool *istrivial_tlist,
 										  double *pNumGroups);
 static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
 										   PlannerInfo *root,
 										   List *refnames_tlist,
 										   List **pTargetList);
+static void build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
+									bool trivial_tlist, List *child_tlist,
+									List *interesting_pathkeys);
 static RelOptInfo *generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 										List *refnames_tlist,
 										List **pTargetList);
@@ -65,9 +69,8 @@ static RelOptInfo *generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *ro
 static List *plan_union_children(PlannerInfo *root,
 								 SetOperationStmt *top_union,
 								 List *refnames_tlist,
-								 List **tlist_list);
-static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-							   PlannerInfo *root);
+								 List **tlist_list,
+								 List **istrivial_tlist);
 static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 								Path *input_path,
@@ -160,6 +163,8 @@ plan_set_operations(PlannerInfo *root)
 	}
 	else
 	{
+		bool		trivial_tlist;
+
 		/*
 		 * Recurse on setOperations tree to generate paths for set ops. The
 		 * final output paths should have just the column types shown as the
@@ -171,6 +176,7 @@ plan_set_operations(PlannerInfo *root)
 										   true, -1,
 										   leftmostQuery->targetList,
 										   &top_tlist,
+										   &trivial_tlist,
 										   NULL);
 	}
 
@@ -180,6 +186,33 @@ plan_set_operations(PlannerInfo *root)
 	return setop_rel;
 }
 
+/*
+ * set_operation_ordered_results_useful
+ *		Return true if the given SetOperationStmt can be executed by utilizing
+ *		paths that provide sorted input according to the setop's targetlist.
+ *		Returns false when sorted paths are no more useful than unsorted
+ *		ones.
+ */
+bool
+set_operation_ordered_results_useful(SetOperationStmt *setop)
+{
+	/*
+	 * Paths sorted by the targetlist are useful for UNION as we can opt to
+	 * MergeAppend the sorted paths then Unique them.  Ordered paths are no
+	 * more useful than unordered ones for UNION ALL.
+	 */
+	if (!setop->all && setop->op == SETOP_UNION)
+		return true;
+
+	/*
+	 * EXCEPT / EXCEPT ALL / INTERSECT / INTERSECT ALL cannot yet utilize
+	 * correctly sorted input paths.
+	 */
+
+	/* sorted paths are not deemed of any additional use for this setop */
+	return false;
+}
+
 /*
  * recurse_set_operations
  *	  Recursively handle one step in a tree of set operations
@@ -200,6 +233,10 @@ plan_set_operations(PlannerInfo *root)
  * the logic in this file depends on flag columns being marked resjunk.
  * Pending a redesign of how that works, this is the easy way out.
  *
+ * 'istrivial_tlist' is an output parameter and is set to true when the
+ * pTargetList is deemed to be a trivial target list as determined by
+ * generate_setop_tlist().
+ *
  * We don't have to care about typmods here: the only allowed difference
  * between set-op input and output typmods is input is a specific typmod
  * and output is -1, and that does not require a coercion.
@@ -210,9 +247,12 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   bool junkOK,
 					   int flag, List *refnames_tlist,
 					   List **pTargetList,
+					   bool *istrivial_tlist,
 					   double *pNumGroups)
 {
-	RelOptInfo *rel = NULL;		/* keep compiler quiet */
+	RelOptInfo *rel;
+
+	*istrivial_tlist = true;	/* for now */
 
 	/* Guard against stack overflow due to overly complex setop nests */
 	check_stack_depth();
@@ -224,8 +264,6 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		Query	   *subquery = rte->subquery;
 		PlannerInfo *subroot;
 		RelOptInfo *final_rel;
-		Path	   *subpath;
-		Path	   *path;
 		List	   *tlist;
 		bool		trivial_tlist;
 
@@ -262,61 +300,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		/* Return the fully-fledged tlist to caller, too */
 		*pTargetList = tlist;
-
-		/*
-		 * Mark rel with estimated output rows, width, etc.  Note that we have
-		 * to do this before generating outer-query paths, else
-		 * cost_subqueryscan is not happy.
-		 */
-		set_subquery_size_estimates(root, rel);
-
-		/*
-		 * Since we may want to add a partial path to this relation, we must
-		 * set its consider_parallel flag correctly.
-		 */
+		*istrivial_tlist = trivial_tlist;
 		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
-		rel->consider_parallel = final_rel->consider_parallel;
-
-		/*
-		 * For the moment, we consider only a single Path for the subquery.
-		 * This should change soon (make it look more like
-		 * set_subquery_pathlist).
-		 */
-		subpath = get_cheapest_fractional_path(final_rel,
-											   root->tuple_fraction);
-
-		/*
-		 * Stick a SubqueryScanPath atop that.
-		 *
-		 * We don't bother to determine the subquery's output ordering since
-		 * it won't be reflected in the set-op result anyhow; so just label
-		 * the SubqueryScanPath with nil pathkeys.  (XXX that should change
-		 * soon too, likely.)
-		 */
-		path = (Path *) create_subqueryscan_path(root, rel, subpath,
-												 trivial_tlist,
-												 NIL, NULL);
-
-		add_path(rel, path);
-
-		/*
-		 * If we have a partial path for the child relation, we can use that
-		 * to build a partial path for this relation.  But there's no point in
-		 * considering any path but the cheapest.
-		 */
-		if (rel->consider_parallel && bms_is_empty(rel->lateral_relids) &&
-			final_rel->partial_pathlist != NIL)
-		{
-			Path	   *partial_subpath;
-			Path	   *partial_path;
-
-			partial_subpath = linitial(final_rel->partial_pathlist);
-			partial_path = (Path *)
-				create_subqueryscan_path(root, rel, partial_subpath,
-										 trivial_tlist,
-										 NIL, NULL);
-			add_partial_path(rel, partial_path);
-		}
 
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
@@ -341,11 +326,11 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			if (subquery->groupClause || subquery->groupingSets ||
 				subquery->distinctClause ||
 				subroot->hasHavingQual || subquery->hasAggs)
-				*pNumGroups = subpath->rows;
+				*pNumGroups = final_rel->rows;
 			else
 				*pNumGroups = estimate_num_groups(subroot,
 												  get_tlist_exprs(subroot->parse->targetList, false),
-												  subpath->rows,
+												  final_rel->rows,
 												  NULL,
 												  NULL);
 		}
@@ -394,6 +379,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 												*pTargetList,
 												refnames_tlist,
 												&trivial_tlist);
+			*istrivial_tlist = trivial_tlist;
 			target = create_pathtarget(root, *pTargetList);
 
 			/* Apply projection to each path */
@@ -424,16 +410,16 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 				lfirst(lc) = path;
 			}
 		}
+		postprocess_setop_rel(root, rel);
 	}
 	else
 	{
 		elog(ERROR, "unrecognized node type: %d",
 			 (int) nodeTag(setOp));
 		*pTargetList = NIL;
+		rel = NULL;				/* keep compiler quiet */
 	}
 
-	postprocess_setop_rel(root, rel);
-
 	return rel;
 }
 
@@ -452,7 +438,9 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	Path	   *lpath;
 	Path	   *rpath;
 	List	   *lpath_tlist;
+	bool		lpath_trivial_tlist;
 	List	   *rpath_tlist;
+	bool		rpath_trivial_tlist;
 	List	   *tlist;
 	List	   *groupList;
 	double		dNumGroups;
@@ -472,7 +460,10 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 								  false, -1,
 								  refnames_tlist,
 								  &lpath_tlist,
-								  NULL);
+								  &lpath_trivial_tlist, NULL);
+	if (lrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, lrel, lpath_trivial_tlist, lpath_tlist,
+								NIL);
 	lpath = lrel->cheapest_total_path;
 	/* The right path will want to look at the left one ... */
 	root->non_recursive_path = lpath;
@@ -481,7 +472,11 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 								  false, -1,
 								  refnames_tlist,
 								  &rpath_tlist,
+								  &rpath_trivial_tlist,
 								  NULL);
+	if (rrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, rrel, rpath_trivial_tlist, rpath_tlist,
+								NIL);
 	rpath = rrel->cheapest_total_path;
 	root->non_recursive_path = NULL;
 
@@ -543,6 +538,256 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	return result_rel;
 }
 
+/*
+ * build_setop_child_paths
+ *		Build paths for the set op child relation denoted by 'rel'.
+ *
+ * If 'interesting_pathkeys' is non-NIL then also include paths that suit
+ * these pathkeys, sorting any unsorted paths as required.
+ */
+static void
+build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
+						bool trivial_tlist, List *child_tlist,
+						List *interesting_pathkeys)
+{
+	RelOptInfo *final_rel;
+	List	   *setop_pathkeys = rel->subroot->setop_pathkeys;
+	ListCell   *lc;
+
+	/* it can't be a set op child rel if it's not a subquery */
+	Assert(rel->rtekind == RTE_SUBQUERY);
+
+	/* when sorting is needed, add child rel equivalences */
+	if (interesting_pathkeys != NIL)
+		add_setop_child_rel_equivalences(root,
+										 rel->relids,
+										 child_tlist,
+										 interesting_pathkeys);
+
+	/*
+	 * Mark rel with estimated output rows, width, etc.  Note that we have to
+	 * do this before generating outer-query paths, else cost_subqueryscan is
+	 * not happy.
+	 */
+	set_subquery_size_estimates(root, rel);
+
+	/*
+	 * Since we may want to add a partial path to this relation, we must set
+	 * its consider_parallel flag correctly.
+	 */
+	final_rel = fetch_upper_rel(rel->subroot, UPPERREL_FINAL, NULL);
+	rel->consider_parallel = final_rel->consider_parallel;
+
+	/* Generate subquery scan paths for any interesting path in final_rel */
+	foreach(lc, final_rel->pathlist)
+	{
+		Path	   *subpath = (Path *) lfirst(lc);
+		List	   *pathkeys;
+		Path	   *cheapest_input_path = final_rel->cheapest_total_path;
+		bool		is_sorted;
+		int			presorted_keys;
+
+		/*
+		 * Include the cheapest path as-is so that the set operation can be
+		 * cheaply implemented using a method which does not require the input
+		 * to be sorted.
+		 */
+		if (subpath == cheapest_input_path)
+		{
+			/* Convert subpath's pathkeys to outer representation */
+			pathkeys = convert_subquery_pathkeys(root, rel, subpath->pathkeys,
+												 make_tlist_from_pathtarget(subpath->pathtarget));
+
+			/* Generate outer path using this subpath */
+			add_path(rel, (Path *) create_subqueryscan_path(root,
+															rel,
+															subpath,
+															trivial_tlist,
+															pathkeys,
+															NULL));
+		}
+
+		/* skip dealing with sorted paths if the setop doesn't need them */
+		if (interesting_pathkeys == NIL)
+			continue;
+
+		/*
+		 * Create paths to suit final sort order required for setop_pathkeys.
+		 * Here we'll sort the cheapest input path (if not sorted already) and
+		 * incremental sort any paths which are partially sorted.
+		 */
+		is_sorted = pathkeys_count_contained_in(setop_pathkeys,
+												subpath->pathkeys,
+												&presorted_keys);
+
+		if (!is_sorted)
+		{
+			double		limittuples = rel->subroot->limit_tuples;
+
+			/*
+			 * Try at least sorting the cheapest path and also try
+			 * incrementally sorting any path which is partially sorted
+			 * already (no need to deal with paths which have presorted keys
+			 * when incremental sort is disabled unless it's the cheapest
+			 * input path).
+			 */
+			if (subpath != cheapest_input_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * We've no need to consider both a sort and incremental sort.
+			 * We'll just do a sort if there are no presorted keys and an
+			 * incremental sort when there are presorted keys.
+			 */
+			if (presorted_keys == 0 || !enable_incremental_sort)
+				subpath = (Path *) create_sort_path(rel->subroot,
+													final_rel,
+													subpath,
+													setop_pathkeys,
+													limittuples);
+			else
+				subpath = (Path *) create_incremental_sort_path(rel->subroot,
+																final_rel,
+																subpath,
+																setop_pathkeys,
+																presorted_keys,
+																limittuples);
+		}
+
+		/* Add the sorted path unless the cheapest path is already sorted. */
+		if (subpath != cheapest_input_path)
+		{
+			/* Convert subpath's pathkeys to outer representation */
+			pathkeys = convert_subquery_pathkeys(root, rel, subpath->pathkeys,
+												 make_tlist_from_pathtarget(subpath->pathtarget));
+
+			/* Generate outer path using this subpath */
+			add_path(rel, (Path *) create_subqueryscan_path(root,
+															rel,
+															subpath,
+															trivial_tlist,
+															pathkeys,
+															NULL));
+		}
+	}
+
+	/* if consider_parallel is false, there should be no partial paths */
+	Assert(final_rel->consider_parallel ||
+		   final_rel->partial_pathlist == NIL);
+
+	/* if rel allows parallelism, do same for partial paths */
+	if (rel->consider_parallel && bms_is_empty(rel->lateral_relids) &&
+		final_rel->partial_pathlist != NIL)
+	{
+		Path	   *cheapest_partial_path;
+		bool		is_sorted;
+		int			presorted_keys;
+
+		cheapest_partial_path = linitial(final_rel->partial_pathlist);
+
+		/* generate interesting subquery paths for each partial path */
+		foreach(lc, final_rel->partial_pathlist)
+		{
+			Path	   *subpath = (Path *) lfirst(lc);
+			List	   *pathkeys;
+
+			/*
+			 * Include the cheapest partial path as-is so that the set
+			 * operation can be cheaply implemented using a method which does
+			 * not require the input to be sorted.
+			 */
+			if (subpath == cheapest_partial_path)
+			{
+				/* convert subpath's pathkeys to outer representation */
+				pathkeys = convert_subquery_pathkeys(root, rel,
+													 subpath->pathkeys,
+													 make_tlist_from_pathtarget(subpath->pathtarget));
+
+				/* generate outer path using this subpath */
+				add_partial_path(rel,
+								 (Path *) create_subqueryscan_path(root,
+																   rel,
+																   subpath,
+																   trivial_tlist,
+																   pathkeys,
+																   NULL));
+			}
+
+			/* skip dealing with sorted paths if the setop doesn't need them */
+			if (interesting_pathkeys == NIL)
+				continue;
+
+
+			/*
+			 * As above, create paths to suit final sort order required for
+			 * setop_pathkeys.  We sort the cheapest partial path (if not
+			 * sorted already) and incremental sort any paths which are
+			 * partially sorted.
+			 */
+			is_sorted = pathkeys_count_contained_in(setop_pathkeys,
+													subpath->pathkeys,
+													&presorted_keys);
+
+			if (!is_sorted)
+			{
+				double		limittuples = rel->subroot->limit_tuples;
+
+				/*
+				 * Try at least sorting the cheapest partial path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (subpath != cheapest_partial_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					subpath = (Path *) create_sort_path(rel->subroot,
+														final_rel,
+														subpath,
+														setop_pathkeys,
+														limittuples);
+				else
+					subpath = (Path *) create_incremental_sort_path(rel->subroot,
+																	final_rel,
+																	subpath,
+																	setop_pathkeys,
+																	presorted_keys,
+																	limittuples);
+			}
+
+			/*
+			 * Add the sorted path unless the cheapest path is already sorted.
+			 */
+			if (subpath != cheapest_partial_path)
+			{
+				/* Convert subpath's pathkeys to outer representation */
+				pathkeys = convert_subquery_pathkeys(root, rel, subpath->pathkeys,
+													 make_tlist_from_pathtarget(subpath->pathtarget));
+
+				/* Generate outer path using this subpath */
+				add_partial_path(rel,
+								 (Path *) create_subqueryscan_path(root,
+																   rel,
+																   subpath,
+																   trivial_tlist,
+																   pathkeys,
+																   NULL));
+			}
+		}
+	}
+
+	postprocess_setop_rel(root, rel);
+}
+
 /*
  * Generate paths for a UNION or UNION ALL node
  */
@@ -553,30 +798,23 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 {
 	Relids		relids = NULL;
 	RelOptInfo *result_rel;
-	double		save_fraction = root->tuple_fraction;
 	ListCell   *lc;
-	List	   *pathlist = NIL;
+	ListCell   *lc2;
+	ListCell   *lc3;
+	List	   *cheapest_pathlist = NIL;
+	List	   *ordered_pathlist = NIL;
 	List	   *partial_pathlist = NIL;
 	bool		partial_paths_valid = true;
 	bool		consider_parallel = true;
 	List	   *rellist;
 	List	   *tlist_list;
+	List	   *trivial_tlist_list;
 	List	   *tlist;
-	Path	   *path;
-
-	/*
-	 * If plain UNION, tell children to fetch all tuples.
-	 *
-	 * Note: in UNION ALL, we pass the top-level tuple_fraction unmodified to
-	 * each arm of the UNION ALL.  One could make a case for reducing the
-	 * tuple fraction for later arms (discounting by the expected size of the
-	 * earlier arms' results) but it seems not worth the trouble. The normal
-	 * case where tuple_fraction isn't already zero is a LIMIT at top level,
-	 * and passing it down as-is is usually enough to get the desired result
-	 * of preferring fast-start plans.
-	 */
-	if (!op->all)
-		root->tuple_fraction = 0.0;
+	List	   *groupList = NIL;
+	Path	   *apath;
+	Path	   *gpath = NULL;
+	bool		try_sorted;
+	List	   *union_pathkeys = NIL;
 
 	/*
 	 * If any of my children are identical UNION nodes (same op, all-flag, and
@@ -584,7 +822,11 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+	rellist = plan_union_children(root,
+								  op,
+								  refnames_tlist,
+								  &tlist_list,
+								  &trivial_tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -595,15 +837,66 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	tlist = generate_append_tlist(op->colTypes, op->colCollations, false,
 								  tlist_list, refnames_tlist);
-
 	*pTargetList = tlist;
 
+	/* For UNIONs (not UNION ALL), try sorting, if sorting is possible */
+	try_sorted = !op->all && grouping_is_sortable(op->groupClauses);
+
+	if (try_sorted)
+	{
+		/* Identify the grouping semantics */
+		groupList = generate_setop_grouplist(op, tlist);
+
+		/* Determine the pathkeys for sorting by the whole target list */
+		union_pathkeys = make_pathkeys_for_sortclauses(root, groupList, tlist);
+	}
+
+	/*
+	 * Now that we've got the append target list, we can build the union child
+	 * paths.
+	 */
+	forthree(lc, rellist, lc2, trivial_tlist_list, lc3, tlist_list)
+	{
+		RelOptInfo *rel = lfirst(lc);
+		bool		trivial_tlist = lfirst_int(lc2);
+		List	   *child_tlist = lfirst_node(List, lc3);
+
+		/* only build paths for the union children */
+		if (rel->rtekind == RTE_SUBQUERY)
+			build_setop_child_paths(root, rel, trivial_tlist, child_tlist,
+									union_pathkeys);
+	}
+
 	/* Build path lists and relid set. */
 	foreach(lc, rellist)
 	{
 		RelOptInfo *rel = lfirst(lc);
+		Path	   *ordered_path;
 
-		pathlist = lappend(pathlist, rel->cheapest_total_path);
+		cheapest_pathlist = lappend(cheapest_pathlist,
+									rel->cheapest_total_path);
+
+		if (try_sorted)
+		{
+			ordered_path = get_cheapest_path_for_pathkeys(rel->pathlist,
+														  union_pathkeys,
+														  NULL,
+														  TOTAL_COST,
+														  false);
+
+			if (ordered_path != NULL)
+				ordered_pathlist = lappend(ordered_pathlist, ordered_path);
+			else
+			{
+				/*
+				 * If we can't find a sorted path, just give up trying to
+				 * generate a list of correctly sorted child paths.  This can
+				 * happen when type coercion was added to the targetlist due
+				 * to mismatching types from the union children.
+				 */
+				try_sorted = false;
+			}
+		}
 
 		if (consider_parallel)
 		{
@@ -626,28 +919,21 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
 	result_rel->reltarget = create_pathtarget(root, tlist);
 	result_rel->consider_parallel = consider_parallel;
+	result_rel->consider_startup = (root->tuple_fraction > 0);
 
 	/*
-	 * Append the child results together.
+	 * Append the child results together using the cheapest paths from each
+	 * union child.
 	 */
-	path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
-									   NIL, NULL, 0, false, -1);
-
-	/*
-	 * For UNION ALL, we just need the Append path.  For UNION, need to add
-	 * node(s) to remove duplicates.
-	 */
-	if (!op->all)
-		path = make_union_unique(op, path, tlist, root);
-
-	add_path(result_rel, path);
+	apath = (Path *) create_append_path(root, result_rel, cheapest_pathlist,
+										NIL, NIL, NULL, 0, false, -1);
 
 	/*
 	 * Estimate number of groups.  For now we just assume the output is unique
 	 * --- this is certainly true for the UNION case, and we want worst-case
 	 * estimates anyway.
 	 */
-	result_rel->rows = path->rows;
+	result_rel->rows = apath->rows;
 
 	/*
 	 * Now consider doing the same thing using the partial paths plus Append
@@ -655,7 +941,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	if (partial_paths_valid)
 	{
-		Path	   *ppath;
+		Path	   *papath;
 		int			parallel_workers = 0;
 
 		/* Find the highest number of workers requested for any subpath. */
@@ -684,21 +970,136 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		}
 		Assert(parallel_workers > 0);
 
-		ppath = (Path *)
+		papath = (Path *)
 			create_append_path(root, result_rel, NIL, partial_pathlist,
-							   NIL, NULL,
-							   parallel_workers, enable_parallel_append,
-							   -1);
-		ppath = (Path *)
-			create_gather_path(root, result_rel, ppath,
+							   NIL, NULL, parallel_workers,
+							   enable_parallel_append, -1);
+		gpath = (Path *)
+			create_gather_path(root, result_rel, papath,
 							   result_rel->reltarget, NULL, NULL);
-		if (!op->all)
-			ppath = make_union_unique(op, ppath, tlist, root);
-		add_path(result_rel, ppath);
 	}
 
-	/* Undo effects of possibly forcing tuple_fraction to 0 */
-	root->tuple_fraction = save_fraction;
+	if (!op->all)
+	{
+		double		dNumGroups;
+		bool		can_sort = grouping_is_sortable(groupList);
+		bool		can_hash = grouping_is_hashable(groupList);
+
+		/*
+		 * XXX for the moment, take the number of distinct groups as equal to
+		 * the total input size, i.e., the worst case.  This is too
+		 * conservative, but it's not clear how to get a decent estimate of
+		 * the true size.  One should note as well the propensity of novices
+		 * to write UNION rather than UNION ALL even when they don't expect
+		 * any duplicates...
+		 */
+		dNumGroups = apath->rows;
+
+		if (can_hash)
+		{
+			Path	   *path;
+
+			/*
+			 * Try a hash aggregate plan on 'apath'.  This is the cheapest
+			 * available path containing each append child.
+			 */
+			path = (Path *) create_agg_path(root,
+											result_rel,
+											apath,
+											create_pathtarget(root, tlist),
+											AGG_HASHED,
+											AGGSPLIT_SIMPLE,
+											groupList,
+											NIL,
+											NULL,
+											dNumGroups);
+			add_path(result_rel, path);
+
+			/* Try hash aggregate on the Gather path, if valid */
+			if (gpath != NULL)
+			{
+				/* Hashed aggregate plan --- no sort needed */
+				path = (Path *) create_agg_path(root,
+												result_rel,
+												gpath,
+												create_pathtarget(root, tlist),
+												AGG_HASHED,
+												AGGSPLIT_SIMPLE,
+												groupList,
+												NIL,
+												NULL,
+												dNumGroups);
+				add_path(result_rel, path);
+			}
+		}
+
+		if (can_sort)
+		{
+			Path	   *path = apath;
+
+			if (groupList != NIL)
+				path = (Path *) create_sort_path(root, result_rel, path,
+												 make_pathkeys_for_sortclauses(root, groupList, tlist),
+												 -1.0);
+
+			path = (Path *) create_upper_unique_path(root,
+													 result_rel,
+													 path,
+													 list_length(path->pathkeys),
+													 dNumGroups);
+
+			add_path(result_rel, path);
+
+			/* Try Sort -> Unique on the Gather path, if set */
+			if (gpath != NULL)
+			{
+				path = gpath;
+
+				path = (Path *) create_sort_path(root, result_rel, path,
+												 make_pathkeys_for_sortclauses(root, groupList, tlist),
+												 -1.0);
+
+				path = (Path *) create_upper_unique_path(root,
+														 result_rel,
+														 path,
+														 list_length(path->pathkeys),
+														 dNumGroups);
+				add_path(result_rel, path);
+			}
+		}
+
+		/*
+		 * Try making a MergeAppend path if we managed to find a path with the
+		 * correct pathkeys in each union child query.
+		 */
+		if (try_sorted && groupList != NIL)
+		{
+			Path	   *path;
+
+			path = (Path *) create_merge_append_path(root,
+													 result_rel,
+													 ordered_pathlist,
+													 union_pathkeys,
+													 NULL);
+
+			/* and make the MergeAppend unique */
+			path = (Path *) create_upper_unique_path(root,
+													 result_rel,
+													 path,
+													 list_length(tlist),
+													 path->rows);
+
+			add_path(result_rel, path);
+		}
+	}
+	else
+	{
+		/* UNION ALL */
+		add_path(result_rel, apath);
+
+		if (gpath != NULL)
+			add_path(result_rel, gpath);
+	}
 
 	return result_rel;
 }
@@ -724,6 +1125,8 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 			   *tlist,
 			   *groupList,
 			   *pathlist;
+	bool		lpath_trivial_tlist,
+				rpath_trivial_tlist;
 	double		dLeftGroups,
 				dRightGroups,
 				dNumGroups,
@@ -743,14 +1146,22 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 								  false, 0,
 								  refnames_tlist,
 								  &lpath_tlist,
+								  &lpath_trivial_tlist,
 								  &dLeftGroups);
+	if (lrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, lrel, lpath_trivial_tlist, lpath_tlist,
+								NIL);
 	lpath = lrel->cheapest_total_path;
 	rrel = recurse_set_operations(op->rarg, root,
 								  op->colTypes, op->colCollations,
 								  false, 1,
 								  refnames_tlist,
 								  &rpath_tlist,
+								  &rpath_trivial_tlist,
 								  &dRightGroups);
+	if (rrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, rrel, rpath_trivial_tlist, rpath_tlist,
+								NIL);
 	rpath = rrel->cheapest_total_path;
 
 	/* Undo effects of forcing tuple_fraction to 0 */
@@ -887,13 +1298,16 @@ static List *
 plan_union_children(PlannerInfo *root,
 					SetOperationStmt *top_union,
 					List *refnames_tlist,
-					List **tlist_list)
+					List **tlist_list,
+					List **istrivial_tlist)
 {
 	List	   *pending_rels = list_make1(top_union);
 	List	   *result = NIL;
 	List	   *child_tlist;
+	bool		trivial_tlist;
 
 	*tlist_list = NIL;
+	*istrivial_tlist = NIL;
 
 	while (pending_rels != NIL)
 	{
@@ -932,75 +1346,15 @@ plan_union_children(PlannerInfo *root,
 														false, -1,
 														refnames_tlist,
 														&child_tlist,
+														&trivial_tlist,
 														NULL));
 		*tlist_list = lappend(*tlist_list, child_tlist);
+		*istrivial_tlist = lappend_int(*istrivial_tlist, trivial_tlist);
 	}
 
 	return result;
 }
 
-/*
- * Add nodes to the given path tree to unique-ify the result of a UNION.
- */
-static Path *
-make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-				  PlannerInfo *root)
-{
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-	List	   *groupList;
-	double		dNumGroups;
-
-	/* Identify the grouping semantics */
-	groupList = generate_setop_grouplist(op, tlist);
-
-	/*
-	 * XXX for the moment, take the number of distinct groups as equal to the
-	 * total input size, ie, the worst case.  This is too conservative, but
-	 * it's not clear how to get a decent estimate of the true size.  One
-	 * should note as well the propensity of novices to write UNION rather
-	 * than UNION ALL even when they don't expect any duplicates...
-	 */
-	dNumGroups = path->rows;
-
-	/* Decide whether to hash or sort */
-	if (choose_hashed_setop(root, groupList, path,
-							dNumGroups, dNumGroups,
-							"UNION"))
-	{
-		/* Hashed aggregate plan --- no sort needed */
-		path = (Path *) create_agg_path(root,
-										result_rel,
-										path,
-										create_pathtarget(root, tlist),
-										AGG_HASHED,
-										AGGSPLIT_SIMPLE,
-										groupList,
-										NIL,
-										NULL,
-										dNumGroups);
-	}
-	else
-	{
-		/* Sort and Unique */
-		if (groupList)
-			path = (Path *)
-				create_sort_path(root,
-								 result_rel,
-								 path,
-								 make_pathkeys_for_sortclauses(root,
-															   groupList,
-															   tlist),
-								 -1.0);
-		path = (Path *) create_upper_unique_path(root,
-												 result_rel,
-												 path,
-												 list_length(path->pathkeys),
-												 dNumGroups);
-	}
-
-	return path;
-}
-
 /*
  * postprocess_setop_rel - perform steps required after adding paths
  */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 534692bee1..b3069516b2 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -397,6 +397,8 @@ struct PlannerInfo
 	List	   *distinct_pathkeys;
 	/* sortClause pathkeys, if any */
 	List	   *sort_pathkeys;
+	/* set operator pathkeys, if any */
+	List	   *setop_pathkeys;
 
 	/* Canonicalised partition schemes used in the query. */
 	List	   *part_schemes pg_node_attr(read_write_ignore);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 0e8a9c94ba..1165e98c5b 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -174,6 +174,9 @@ extern void add_child_join_rel_equivalences(PlannerInfo *root,
 											AppendRelInfo **appinfos,
 											RelOptInfo *parent_joinrel,
 											RelOptInfo *child_joinrel);
+extern void add_setop_child_rel_equivalences(PlannerInfo *root, Relids relids,
+											 List *tlist,
+											 List *setop_pathkeys);
 extern List *generate_implied_equalities_for_column(PlannerInfo *root,
 													RelOptInfo *rel,
 													ec_matches_callback_type callback,
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index 8e00716dc8..a52dec285d 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -53,6 +53,6 @@ extern void preprocess_aggrefs(PlannerInfo *root, Node *clause);
  * prototypes for prepunion.c
  */
 extern RelOptInfo *plan_set_operations(PlannerInfo *root);
-
+extern bool set_operation_ordered_results_useful(SetOperationStmt *setop);
 
 #endif							/* PREP_H */
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 7a05c75967..3b6c318a63 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1396,6 +1396,7 @@ SELECT x FROM test3cs WHERE x ~ 'a';
  abc
 (1 row)
 
+SET enable_hashagg TO off;
 SELECT x FROM test1cs UNION SELECT x FROM test2cs ORDER BY x;
   x  
 -----
@@ -1448,6 +1449,7 @@ SELECT DISTINCT x FROM test3cs ORDER BY x;
  ghi
 (4 rows)
 
+RESET enable_hashagg;
 SELECT count(DISTINCT x) FROM test3cs;
  count 
 -------
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 7fdb685313..5fd54a10b1 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -1472,14 +1472,19 @@ explain (costs off) select * from t union select * from t order by 1,3;
    Sort Key: t.a, t.c
    Presorted Key: t.a
    ->  Unique
-         ->  Sort
+         ->  Merge Append
                Sort Key: t.a, t.b, t.c
-               ->  Gather
+               ->  Gather Merge
                      Workers Planned: 2
-                     ->  Parallel Append
+                     ->  Sort
+                           Sort Key: t.a, t.b, t.c
                            ->  Parallel Seq Scan on t
+               ->  Gather Merge
+                     Workers Planned: 2
+                     ->  Sort
+                           Sort Key: t_1.a, t_1.b, t_1.c
                            ->  Parallel Seq Scan on t t_1
-(11 rows)
+(16 rows)
 
 -- Full sort, not just incremental sort can be pushed below a gather merge path
 -- by generate_useful_gather_paths.
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 73e320bad4..b8d6b112f1 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -370,16 +370,16 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                     QUERY PLAN                                     
-------------------------------------------------------------------------------------
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  HashSetOp Intersect
                ->  Append
-                     ->  Subquery Scan on "*SELECT* 2"
-                           ->  Seq Scan on tenk1
                      ->  Subquery Scan on "*SELECT* 1"
-                           ->  Index Only Scan using tenk1_unique1 on tenk1 tenk1_1
+                           ->  Index Only Scan using tenk1_unique1 on tenk1
+                     ->  Subquery Scan on "*SELECT* 2"
+                           ->  Seq Scan on tenk1 tenk1_1
 (8 rows)
 
 select count(*) from
@@ -412,16 +412,17 @@ set enable_hashagg to off;
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 union select fivethous from tenk1 ) ss;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                           QUERY PLAN                           
+----------------------------------------------------------------
  Aggregate
    ->  Unique
-         ->  Sort
+         ->  Merge Append
                Sort Key: tenk1.unique1
-               ->  Append
-                     ->  Index Only Scan using tenk1_unique1 on tenk1
+               ->  Index Only Scan using tenk1_unique1 on tenk1
+               ->  Sort
+                     Sort Key: tenk1_1.fivethous
                      ->  Seq Scan on tenk1 tenk1_1
-(7 rows)
+(8 rows)
 
 select count(*) from
   ( select unique1 from tenk1 union select fivethous from tenk1 ) ss;
@@ -433,18 +434,18 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                        QUERY PLAN                                        
-------------------------------------------------------------------------------------------
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  SetOp Intersect
                ->  Sort
-                     Sort Key: "*SELECT* 2".fivethous
+                     Sort Key: "*SELECT* 1".unique1
                      ->  Append
-                           ->  Subquery Scan on "*SELECT* 2"
-                                 ->  Seq Scan on tenk1
                            ->  Subquery Scan on "*SELECT* 1"
-                                 ->  Index Only Scan using tenk1_unique1 on tenk1 tenk1_1
+                                 ->  Index Only Scan using tenk1_unique1 on tenk1
+                           ->  Subquery Scan on "*SELECT* 2"
+                                 ->  Seq Scan on tenk1 tenk1_1
 (10 rows)
 
 select count(*) from
@@ -950,16 +951,8 @@ select except select;
 -- check hashed implementation
 set enable_hashagg = true;
 set enable_sort = false;
-explain (costs off)
-select from generate_series(1,5) union select from generate_series(1,3);
-                           QUERY PLAN                           
-----------------------------------------------------------------
- HashAggregate
-   ->  Append
-         ->  Function Scan on generate_series
-         ->  Function Scan on generate_series generate_series_1
-(4 rows)
-
+-- We've no way to check hashed UNION as the empty pathkeys in the Append are
+-- fine to make use of Unique, which is cheaper than HashAggregate.
 explain (costs off)
 select from generate_series(1,5) intersect select from generate_series(1,3);
                               QUERY PLAN                              
@@ -972,10 +965,6 @@ select from generate_series(1,5) intersect select from generate_series(1,3);
                ->  Function Scan on generate_series generate_series_1
 (6 rows)
 
-select from generate_series(1,5) union select from generate_series(1,3);
---
-(1 row)
-
 select from generate_series(1,5) union all select from generate_series(1,3);
 --
 (8 rows)
@@ -1102,16 +1091,17 @@ explain (costs off)
   UNION
   SELECT * FROM t2) t
  WHERE ab = 'ab';
-                    QUERY PLAN                     
----------------------------------------------------
- HashAggregate
-   Group Key: ((t1.a || t1.b))
-   ->  Append
-         ->  Index Scan using t1_ab_idx on t1
-               Index Cond: ((a || b) = 'ab'::text)
-         ->  Index Only Scan using t2_pkey on t2
-               Index Cond: (ab = 'ab'::text)
-(7 rows)
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Unique
+   ->  Sort
+         Sort Key: ((t1.a || t1.b))
+         ->  Append
+               ->  Index Scan using t1_ab_idx on t1
+                     Index Cond: ((a || b) = 'ab'::text)
+               ->  Index Only Scan using t2_pkey on t2
+                     Index Cond: (ab = 'ab'::text)
+(8 rows)
 
 --
 -- Test that ORDER BY for UNION ALL can be pushed down to inheritance
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 3db9e25913..96d5d6b11c 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -555,6 +555,7 @@ SELECT x FROM test3cs WHERE x LIKE 'a%';
 SELECT x FROM test3cs WHERE x ILIKE 'a%';
 SELECT x FROM test3cs WHERE x SIMILAR TO 'a%';
 SELECT x FROM test3cs WHERE x ~ 'a';
+SET enable_hashagg TO off;
 SELECT x FROM test1cs UNION SELECT x FROM test2cs ORDER BY x;
 SELECT x FROM test2cs UNION SELECT x FROM test1cs ORDER BY x;
 SELECT x FROM test1cs INTERSECT SELECT x FROM test2cs;
@@ -562,6 +563,7 @@ SELECT x FROM test2cs INTERSECT SELECT x FROM test1cs;
 SELECT x FROM test1cs EXCEPT SELECT x FROM test2cs;
 SELECT x FROM test2cs EXCEPT SELECT x FROM test1cs;
 SELECT DISTINCT x FROM test3cs ORDER BY x;
+RESET enable_hashagg;
 SELECT count(DISTINCT x) FROM test3cs;
 SELECT x, count(*) FROM test3cs GROUP BY x ORDER BY x;
 SELECT x, row_number() OVER (ORDER BY x), rank() OVER (ORDER BY x) FROM test3cs ORDER BY x;
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 6c509ac80c..3be29acb04 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -302,12 +302,11 @@ select except select;
 set enable_hashagg = true;
 set enable_sort = false;
 
-explain (costs off)
-select from generate_series(1,5) union select from generate_series(1,3);
+-- We've no way to check hashed UNION as the empty pathkeys in the Append are
+-- fine to make use of Unique, which is cheaper than HashAggregate.
 explain (costs off)
 select from generate_series(1,5) intersect select from generate_series(1,3);
 
-select from generate_series(1,5) union select from generate_series(1,3);
 select from generate_series(1,5) union all select from generate_series(1,3);
 select from generate_series(1,5) intersect select from generate_series(1,3);
 select from generate_series(1,5) intersect all select from generate_series(1,3);
#8David Rowley
dgrowleyml@gmail.com
In reply to: Andy Fan (#6)
Re: Properly pathify the union planner

On Wed, 7 Feb 2024 at 12:05, Andy Fan <zhihuifan1213@163.com> wrote:

+static int
+pathkeys_useful_for_setop(PlannerInfo *root, List *pathkeys)
+{
+       int                     n_common_pathkeys;
+
+       if (root->setop_pathkeys == NIL)
+               return 0;                               /* no special setop ordering requested */
+
+       if (pathkeys == NIL)
+               return 0;                               /* unordered path */
+
+       (void) pathkeys_count_contained_in(root->setop_pathkeys, pathkeys,
+                                                                          &n_common_pathkeys);
+
+       return n_common_pathkeys;
+}

The two if-clauses look unnecessary; this should be handled by
pathkeys_count_contained_in already. The same issue exists in
pathkeys_useful_for_ordering as well. The attached patch fixes it in master.

I agree. I'd rather not have those redundant checks in
pathkeys_useful_for_setop(), and I do want those functions to be as
similar as possible. So I think adjusting it in master is a good
idea.

I've pushed your patch.
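
For reference, with those two checks removed, pathkeys_useful_for_ordering()
ends up in much the same shape as the new setop variant (just a sketch of the
simplified form; see the pushed commit for the exact change):

static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
    int         n_common_pathkeys;

    /* count how many leading query_pathkeys are provided by 'pathkeys' */
    (void) pathkeys_count_contained_in(root->query_pathkeys, pathkeys,
                                       &n_common_pathkeys);

    return n_common_pathkeys;
}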

David

#9Andy Fan
zhihuifan1213@163.com
In reply to: David Rowley (#8)
Re: Properly pathify the union planner

David Rowley <dgrowleyml@gmail.com> writes:

The two if-clauses look unnecessary; this should be handled by
pathkeys_count_contained_in already. The same issue exists in
pathkeys_useful_for_ordering as well. The attached patch fixes it in master.

I agree. I'd rather not have those redundant checks in
pathkeys_useful_for_setop(), and I do want those functions to be as
similar as possible. So I think adjusting it in master is a good
idea.

I've pushed your patch.

Thanks for pushing it!

--
Best Regards
Andy Fan

#10David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#7)
Re: Properly pathify the union planner

On Thu, 15 Feb 2024 at 17:30, David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 6 Feb 2024 at 22:05, Richard Guo <guofenglinux@gmail.com> wrote:

I'm thinking that maybe it'd be better to move the work of sorting the
subquery's paths to the outer query level, specifically within the
build_setop_child_paths() function, just before we stick SubqueryScanPath
on top of the subquery's paths. I think this is better because:

1. This minimizes the impact on subquery planning and reduces the
footprint within the grouping_planner() function as much as possible.

2. This can help avoid the aforementioned add_path() issue because the
two involved paths will be structured as:

Yes, this is a good idea. I agree with both of your points.

v2 attached.

If anyone else wants to take another look, or if you do, let me know soon.
Otherwise, I'll assume the reviews are over and I'll take another look
myself.

If nobody speaks up before Monday next week (11th), New Zealand time,
I'm going to be looking at this again from the point of view of
committing it.

Thanks

David

#11Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#10)
Re: Properly pathify the union planner

On Thu, Mar 7, 2024 at 7:16 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 15 Feb 2024 at 17:30, David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 6 Feb 2024 at 22:05, Richard Guo <guofenglinux@gmail.com> wrote:

I'm thinking that maybe it'd be better to move the work of sorting the
subquery's paths to the outer query level, specifically within the
build_setop_child_paths() function, just before we stick SubqueryScanPath
on top of the subquery's paths. I think this is better because:

1. This minimizes the impact on subquery planning and reduces the
footprint within the grouping_planner() function as much as possible.

2. This can help avoid the aforementioned add_path() issue because the
two involved paths will be structured as:

Yes, this is a good idea. I agree with both of your points.

v2 attached.

If anyone else wants to take another look, or if you do, let me know soon.
Otherwise, I'll assume the reviews are over and I'll take another look
myself.

Hi David,

I would like to have another look, but it might take several days.
Would that be too late?

Thanks
Richard

#12David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#11)
Re: Properly pathify the union planner

On Fri, 8 Mar 2024 at 00:39, Richard Guo <guofenglinux@gmail.com> wrote:

I would like to have another look, but it might take several days.
Would that be too late?

Please look. Several days is fine. I'd like to start looking on Monday
or Tuesday next week.

Thanks

David

#13Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#12)
Re: Properly pathify the union planner

On Fri, Mar 8, 2024 at 11:31 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 8 Mar 2024 at 00:39, Richard Guo <guofenglinux@gmail.com> wrote:

I would like to have another look, but it might take several days.
Would that be too late?

Please look. Several days is fine. I'd like to start looking on Monday
or Tuesday next week.

I've had another look, and here are some comments that came to my mind.

* There are cases where the setop_pathkeys of a subquery does not match
the union_pathkeys generated in generate_union_paths() for sorting by
the whole target list. In such cases, even if we have explicitly sorted
the paths of the subquery to meet the ordering required for its
setop_pathkeys, we cannot find appropriate ordered paths for
MergeAppend. For instance,

explain (costs off)
(select a, b from t t1 where a = b) UNION (select a, b from t t2 where a = b);
                         QUERY PLAN
-----------------------------------------------------------
 Unique
   ->  Sort
         Sort Key: t1.a, t1.b
         ->  Append
               ->  Index Only Scan using t_a_b_idx on t t1
                     Filter: (a = b)
               ->  Index Only Scan using t_a_b_idx on t t2
                     Filter: (a = b)
(8 rows)

(Assume t_a_b_idx is a btree index on t(a, b))

In this query, the setop_pathkeys of the subqueries includes only one
PathKey because 'a' and 'b' are in the same EC inside the subqueries,
while the union_pathkeys of the whole query includes two PathKeys, one
for each target entry. After we convert the setop_pathkeys to outer
representation, we'd notice that it does not match union_pathkeys.
Consequently, we are unable to recognize that the index scan paths are
already appropriately sorted, leading us to miss the opportunity to
utilize MergeAppend.

Not sure if this case is common enough to be worth paying attention to.

* In build_setop_child_paths() we also create properly sorted partial
paths, which seems not necessary because we do not support parallel
merge append, right?

* Another is minor and relates to cosmetic matters. When we unique-ify
the result of a UNION, we take the number of distinct groups as equal to
the total input size. For the Append path and Gather path, we use
'dNumGroups', which is 'rows' of the Append path. For the MergeAppend
we use 'rows' of the MergeAppend path. I believe they are supposed to
be the same, but I think it'd be better to keep them consistent: either
use 'dNumGroups' for all the three kinds of paths, or use 'path->rows'
for each path.
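
For instance, the former would just mean passing dNumGroups in the
MergeAppend branch too, something along these lines (an untested sketch
against the generate_union_paths() hunk in the patch):

            /* and make the MergeAppend unique */
            path = (Path *) create_upper_unique_path(root,
                                                     result_rel,
                                                     path,
                                                     list_length(tlist),
                                                     dNumGroups);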

Thanks
Richard

#14David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#13)
2 attachment(s)
Re: Properly pathify the union planner

On Mon, 11 Mar 2024 at 19:56, Richard Guo <guofenglinux@gmail.com> wrote:

* There are cases where the setop_pathkeys of a subquery does not match
the union_pathkeys generated in generate_union_paths() for sorting by
the whole target list. In such cases, even if we have explicitly sorted
the paths of the subquery to meet the ordering required for its
setop_pathkeys, we cannot find appropriate ordered paths for
MergeAppend. For instance,

explain (costs off)
(select a, b from t t1 where a = b) UNION (select a, b from t t2 where a = b);
                         QUERY PLAN
-----------------------------------------------------------
 Unique
   ->  Sort
         Sort Key: t1.a, t1.b
         ->  Append
               ->  Index Only Scan using t_a_b_idx on t t1
                     Filter: (a = b)
               ->  Index Only Scan using t_a_b_idx on t t2
                     Filter: (a = b)
(8 rows)

(Assume t_a_b_idx is a btree index on t(a, b))

In this query, the setop_pathkeys of the subqueries includes only one
PathKey because 'a' and 'b' are in the same EC inside the subqueries,
while the union_pathkeys of the whole query includes two PathKeys, one
for each target entry. After we convert the setop_pathkeys to outer
representation, we'd notice that it does not match union_pathkeys.
Consequently, we are unable to recognize that the index scan paths are
already appropriately sorted, leading us to miss the opportunity to
utilize MergeAppend.

Not sure if this case is common enough to be worth paying attention to.

I've spent almost all day looking into this, which is just enough work
to satisfy myself this *is* future work rather than v17 work.

The reason I feel this is future work rather than work for this patch
is that this is already a limitation of subqueries in general and it's
not unique to union child queries.

For example:

create table ab(a int, b int, primary key(a,b));
insert into ab select x,y from generate_series(1,100)x,generate_series(1,100)y;
vacuum analyze ab;
explain select * from (select a,b from ab where a = 1 order by a,b
limit 10) order by a,b;

The current plan for this is:

                    QUERY PLAN
-------------------------------------------------
 Sort
   Sort Key: ab.a, ab.b
   ->  Limit
         ->  Index Only Scan using ab_pkey on ab
               Index Cond: (a = 1)
(5 rows)

The additional sort isn't required but is added because the outer
query requires the pathkeys {a,b} and the inner query only has the
pathkey {b}. {a} is removed because it's redundant due to the const
member. The outer query does not know about the redundant pathkeys,
so it thinks the subquery is only sorted by {b} and therefore adds
the sort on "a", "b".

The attached 0001 patch (renamed as .txt so it's ignored by the CFbot)
adjusts convert_subquery_pathkeys() to have it look a bit deeper and
try harder to match the path to the outer query's query_pathkeys.
After patching with that, the plan becomes:

                 QUERY PLAN
-------------------------------------------
 Limit
   ->  Index Only Scan using ab_pkey on ab
         Index Cond: (a = 1)
(3 rows)

The patch is still incomplete as the matching is quite complex for the
case you mentioned with a=b. It's a bit late here to start making
that work, but I think the key is to give subquery_matches_pathkeys()
an extra parameter or two for where to start working on each list and
have it recursively call itself where I've left the TODO comment in
the function, trying the next query_pathkey against the same
sub_pathkey on the recursive call. If the recursive call returns
false, continue trying to match the normal way; if it returns true,
return true.
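
Roughly what I have in mind is something like the following (an untested
sketch only, reusing find_subquery_eclass() from the attached patch; the
real thing needs more thought about where each list should restart):

static bool
subquery_matches_pathkeys_recurse(PlannerInfo *root, RelOptInfo *rel,
                                  List *subquery_pathkeys,
                                  List *subquery_tlist,
                                  List *query_pathkeys,
                                  ListCell *query_lc, ListCell *sub_lc)
{
    for (; query_lc != NULL; query_lc = lnext(query_pathkeys, query_lc))
    {
        PathKey    *pathkey = lfirst_node(PathKey, query_lc);
        EquivalenceClass *sub_eclass;
        PathKey    *sub_pathkey;

        sub_eclass = find_subquery_eclass(pathkey->pk_eclass, rel,
                                          subquery_tlist);
        if (sub_eclass == NULL)
            return false;

        /* redundant in the subquery, so this query_pathkey comes for free */
        if (EC_MUST_BE_REDUNDANT(sub_eclass))
            continue;

        if (sub_lc == NULL)
            return false;

        sub_pathkey = lfirst_node(PathKey, sub_lc);
        if (sub_eclass != sub_pathkey->pk_eclass ||
            sub_pathkey->pk_strategy != pathkey->pk_strategy ||
            sub_pathkey->pk_nulls_first != pathkey->pk_nulls_first)
            return false;

        /*
         * First try matching the next query_pathkey against this same
         * sub_pathkey (covers the a=b case); if that fails, fall through
         * and advance sub_lc as normal.
         */
        if (subquery_matches_pathkeys_recurse(root, rel, subquery_pathkeys,
                                              subquery_tlist, query_pathkeys,
                                              lnext(query_pathkeys, query_lc),
                                              sub_lc))
            return true;

        sub_lc = lnext(subquery_pathkeys, sub_lc);
    }

    return true;
}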

There'd be a bit more work elsewhere to do to make this work for the
general case. For example:

explain (costs off) select * from (select a,b from ab where a = 1
offset 0) order by a,b;

still produces the following plan with the patched version:

                 QUERY PLAN
-------------------------------------------
 Sort
   Sort Key: ab.a, ab.b
   ->  Index Only Scan using ab_pkey on ab
         Index Cond: (a = 1)
(4 rows)

The reason for this is that the subquery does not know the outer query
would like the index path to have the pathkeys {a,b}. I've not
looked at this issue in detail, but I suspect we could do something
somewhere like set_subquery_pathlist() to check if the outer query's
query_pathkeys are all Vars from the subquery. I think we'd need to
trawl through the eclasses to ensure that each query_pathkeys eclass
contains a member mentioning only a Var from the subquery and, if so,
get the subquery to set those pathkeys.

Anyway, this is all at least PG18 work so I'll raise a thread about it
around June. The patch is included in case you want to mess around
with it. I'd be happy if you want to look into the subquery pathkey
issue portion of the work. I won't be giving this much more focus
until after the freeze.

* In build_setop_child_paths() we also create properly sorted partial
paths, which seems not necessary because we do not support parallel
merge append, right?

Yeah. Thanks for noticing that. Removing all that saves quite a bit more code.

* Another is minor and relates to cosmetic matters. When we unique-ify
the result of a UNION, we take the number of distinct groups as equal to
the total input size. For the Append path and Gather path, we use
'dNumGroups', which is 'rows' of the Append path. For the MergeAppend
we use 'rows' of the MergeAppend path. I believe they are supposed to
be the same, but I think it'd be better to keep them consistent: either
use 'dNumGroups' for all the three kinds of paths, or use 'path->rows'
for each path.

Yeah, that should use dNumGroups. Well spotted.

I've attached v3.

David

Attachments:

v0-0001-Enhance-convert_subquery_pathkeys.patch.txt (text/plain; charset=US-ASCII)
From 52250debeeb6dd70e4f3cb2aa992dda7c091eb4c Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Tue, 12 Mar 2024 22:33:11 +1300
Subject: [PATCH v0] Enhance convert_subquery_pathkeys

Have this try to use query_pathkeys if those keys are compatible with
the subquery.  This can eliminate sorting of subqueries by working a bit
harder to see if certain pathkeys in the subquery are not present
because they're redundant.

A bit more work needs to be done here.  subquery_matches_pathkeys()
needs to be further enhanced to search the next query_pathkey once it
matches.  This will allow queries such as:

select * from (select a,b from t where a=b limit 10) order by a,b;

to use an index on t(a,b).  subquery_matches_pathkeys currently fails to
match this as it skips to the next sub_pathkey, of which there is none.
---
 src/backend/optimizer/path/pathkeys.c | 123 ++++++++++++++++++++++++++
 1 file changed, 123 insertions(+)

diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 3f1a4050e7..8003821e06 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -1049,6 +1049,120 @@ build_expression_pathkey(PlannerInfo *root,
 	return pathkeys;
 }
 
+static EquivalenceClass *
+find_subquery_eclass(EquivalenceClass *eclass, RelOptInfo *rel, List *subquery_tlist)
+{
+	ListCell *lc;
+
+	foreach (lc, eclass->ec_members)
+	{
+		EquivalenceMember *em = lfirst_node(EquivalenceMember, lc);
+		ListCell *lc2;
+
+		/*
+		 * Ignore child members unless they belong to the requested rel.
+		 */
+		if (em->em_is_child && !bms_is_subset(em->em_relids, rel->relids))
+			continue;
+
+		foreach (lc2, subquery_tlist)
+		{
+			TargetEntry *tle = (TargetEntry *) lfirst(lc2);
+			EquivalenceClass *sub_ec;
+			Var *outer_var;
+
+			outer_var = find_var_for_subquery_tle(rel, tle);
+
+			if (!equal(outer_var, em->em_expr))
+				continue;
+
+			sub_ec = get_eclass_for_sort_expr(rel->subroot,
+											  (Expr *) tle->expr,
+											  eclass->ec_opfamilies,
+											  em->em_datatype,
+											  exprCollation((Node *) em->em_expr),
+											  0,
+											  NULL,
+											  false);
+
+			if (sub_ec)
+				return sub_ec;
+		}
+	}
+
+	return NULL;
+}
+
+/*
+ * subquery_matches_pathkeys
+ *		Check if the subquery in 'rel' with the given 'subquery_pathkeys'
+ *		implements 'query_pathkeys', taking into account that some of the
+ *		given 'subquery_pathkeys' may have been marked as redundant or
+ *		consecutive query_pathkeys may have been merged into a single
+ *		EquivalenceClass in the subquery.
+ *
+ * 'query_pathkeys' must not be an empty List.
+ */
+static bool
+subquery_matches_pathkeys(PlannerInfo *root, RelOptInfo *rel,
+						  List *subquery_pathkeys, List *subquery_tlist,
+						  List *query_pathkeys)
+{
+	ListCell *sub_lc = list_head(subquery_pathkeys);
+	ListCell *lc;
+
+	Assert(query_pathkeys != NIL);
+
+	foreach (lc, query_pathkeys)
+	{
+		PathKey *pathkey = lfirst_node(PathKey, lc);
+		EquivalenceClass *eclass = pathkey->pk_eclass;
+		EquivalenceClass *sub_eclass;
+		PathKey *sub_pathkey;
+
+		/*
+		 * Look through every member in this eclass and look to see if there's
+		 * a subquery target mentioning one of the eclass's members.  If found
+		 * search for an eclass for that subquery target and return it.
+		 */
+		sub_eclass = find_subquery_eclass(eclass, rel, subquery_tlist);
+
+		if (sub_eclass == NULL)
+			return false;
+
+		if (EC_MUST_BE_REDUNDANT(sub_eclass))
+		{
+			/*
+			 * The subquery pathkey must have been removed because it's redundant.
+			 * skip to the next query_pathkey
+			 */
+			continue;
+		}
+
+		if (sub_lc == NULL)
+			return false;
+
+		sub_pathkey = lfirst_node(PathKey, sub_lc);
+
+		/* if the eclass matches then skip to the next sub_pathkey */
+		if (sub_eclass == sub_pathkey->pk_eclass)
+		{
+			/* matching eclasses.  Check for matching properties */
+			if (sub_pathkey->pk_strategy != pathkey->pk_strategy ||
+				sub_pathkey->pk_nulls_first != pathkey->pk_nulls_first)
+				return false;
+
+			/* 1:1 match. Skip to the next pathkeys */
+			/* TODO try matching this sub_pathkey to the next query_pathkey */
+			sub_lc = lnext(subquery_pathkeys, sub_lc);
+		}
+		else
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * convert_subquery_pathkeys
  *	  Build a pathkeys list that describes the ordering of a subquery's
@@ -1075,6 +1189,15 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
 	int			outer_query_keys = list_length(root->query_pathkeys);
 	ListCell   *i;
 
+	/* XXX should this be a List of Lists of interesting pathkeys? */
+	if (root->query_pathkeys != NIL &&
+		subquery_matches_pathkeys(root,
+								  rel,
+								  subquery_pathkeys,
+								  subquery_tlist,
+								  root->query_pathkeys))
+		return root->query_pathkeys;
+
 	foreach(i, subquery_pathkeys)
 	{
 		PathKey    *sub_pathkey = (PathKey *) lfirst(i);
-- 
2.40.1.windows.1

better_union_planner_v3.patch (text/plain; charset=US-ASCII)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 58a603ac56..70fb791306 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -11495,6 +11495,10 @@ DROP INDEX base_tbl1_idx;
 DROP INDEX base_tbl2_idx;
 DROP INDEX async_p3_idx;
 -- UNION queries
+SET enable_sort TO off;
+SET enable_incremental_sort TO off;
+-- Adjust fdw_startup_cost so that we get an unordered path in the Append.
+ALTER SERVER loopback2 OPTIONS (ADD fdw_startup_cost '0.00');
 EXPLAIN (VERBOSE, COSTS OFF)
 INSERT INTO result_tbl
 (SELECT a, b, 'AAA' || c FROM async_p1 ORDER BY a LIMIT 10)
@@ -11576,6 +11580,9 @@ SELECT * FROM result_tbl ORDER BY a;
 (12 rows)
 
 DELETE FROM result_tbl;
+RESET enable_incremental_sort;
+RESET enable_sort;
+ALTER SERVER loopback2 OPTIONS (DROP fdw_startup_cost);
 -- Disable async execution if we use gating Result nodes for pseudoconstant
 -- quals
 EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index e3d147de6d..5fffc4c53b 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3877,6 +3877,11 @@ DROP INDEX base_tbl2_idx;
 DROP INDEX async_p3_idx;
 
 -- UNION queries
+SET enable_sort TO off;
+SET enable_incremental_sort TO off;
+-- Adjust fdw_startup_cost so that we get an unordered path in the Append.
+ALTER SERVER loopback2 OPTIONS (ADD fdw_startup_cost '0.00');
+
 EXPLAIN (VERBOSE, COSTS OFF)
 INSERT INTO result_tbl
 (SELECT a, b, 'AAA' || c FROM async_p1 ORDER BY a LIMIT 10)
@@ -3903,6 +3908,10 @@ UNION ALL
 SELECT * FROM result_tbl ORDER BY a;
 DELETE FROM result_tbl;
 
+RESET enable_incremental_sort;
+RESET enable_sort;
+ALTER SERVER loopback2 OPTIONS (DROP fdw_startup_cost);
+
 -- Disable async execution if we use gating Result nodes for pseudoconstant
 -- quals
 EXPLAIN (VERBOSE, COSTS OFF)
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 4bd60a09c6..c036911309 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2867,6 +2867,58 @@ add_child_join_rel_equivalences(PlannerInfo *root,
 	MemoryContextSwitchTo(oldcontext);
 }
 
+/*
+ * add_setop_child_rel_equivalences
+ *		Add equivalence members for each non-resjunk target in 'child_tlist'
+ *		to the EquivalenceClass in the corresponding setop_pathkey's pk_class.
+ *
+ * 'root' is the PlannerInfo belonging to the top-level set operation.
+ * 'child_relids' are the Relids of the child relation we're adding
+ * EquivalenceMembers for.
+ * 'child_tlist' is the target list for the setop child relation.  The target
+ * list expressions are what we add as EquivalenceMembers.
+ * 'setop_pathkeys' is a list of PathKeys which must contain an entry for each
+ * non-resjunk target in 'child_tlist'.
+ */
+void
+add_setop_child_rel_equivalences(PlannerInfo *root, Relids child_relids,
+								 List *child_tlist, List *setop_pathkeys)
+{
+	ListCell   *lc;
+	ListCell   *lc2 = list_head(setop_pathkeys);
+
+	foreach(lc, child_tlist)
+	{
+		TargetEntry *tle = lfirst_node(TargetEntry, lc);
+		EquivalenceMember *parent_em;
+		PathKey    *pk;
+
+		if (tle->resjunk)
+			continue;
+
+		if (lc2 == NULL)
+			elog(ERROR, "too few pathkeys for set operation");
+
+		pk = lfirst_node(PathKey, lc2);
+		parent_em = linitial(pk->pk_eclass->ec_members);
+
+		/*
+		 * We can safely pass the parent member as the first member in the
+		 * ec_members list as this is added first in generate_union_paths,
+		 * likewise, the JoinDomain can be that of the initial member of the
+		 * Pathkey's EquivalenceClass.
+		 */
+		add_eq_member(pk->pk_eclass,
+					  tle->expr,
+					  child_relids,
+					  parent_em->em_jdomain,
+					  parent_em,
+					  exprType((Node *) tle->expr));
+
+		lc2 = lnext(setop_pathkeys, lc2);
+	}
+}
+
 
 /*
  * generate_implied_equalities_for_column
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 3f1a4050e7..1d61881a6b 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -2191,6 +2191,22 @@ pathkeys_useful_for_grouping(PlannerInfo *root, List *pathkeys)
 	return n;
 }
 
+/*
+ * pathkeys_useful_for_setop
+ *		Count the number of leading common pathkeys root's 'setop_pathkeys' in
+ *		'pathkeys'.
+ */
+static int
+pathkeys_useful_for_setop(PlannerInfo *root, List *pathkeys)
+{
+	int			n_common_pathkeys;
+
+	(void) pathkeys_count_contained_in(root->setop_pathkeys, pathkeys,
+									   &n_common_pathkeys);
+
+	return n_common_pathkeys;
+}
+
 /*
  * truncate_useless_pathkeys
  *		Shorten the given pathkey list to just the useful pathkeys.
@@ -2208,6 +2224,9 @@ truncate_useless_pathkeys(PlannerInfo *root,
 	if (nuseful2 > nuseful)
 		nuseful = nuseful2;
 	nuseful2 = pathkeys_useful_for_grouping(root, pathkeys);
+	if (nuseful2 > nuseful)
+		nuseful = nuseful2;
+	nuseful2 = pathkeys_useful_for_setop(root, pathkeys);
 	if (nuseful2 > nuseful)
 		nuseful = nuseful2;
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 443ab08d75..402c573812 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -54,6 +54,7 @@
 #include "optimizer/tlist.h"
 #include "parser/analyze.h"
 #include "parser/parse_agg.h"
+#include "parser/parse_clause.h"
 #include "parser/parse_relation.h"
 #include "parser/parsetree.h"
 #include "partitioning/partdesc.h"
@@ -119,6 +120,8 @@ typedef struct
 {
 	List	   *activeWindows;	/* active windows, if any */
 	grouping_sets_data *gset_data;	/* grouping sets data, if any */
+	SetOperationStmt *setop;	/* parent set operation or NULL if not a
+								 * subquery belonging to a set operation */
 } standard_qp_extra;
 
 /* Local functions */
@@ -249,6 +252,8 @@ static bool group_by_has_partkey(RelOptInfo *input_rel,
 								 List *targetList,
 								 List *groupClause);
 static int	common_prefix_cmp(const void *a, const void *b);
+static List *generate_setop_child_grouplist(SetOperationStmt *op,
+											List *targetlist);
 
 
 /*****************************************************************************
@@ -1500,6 +1505,18 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		qp_extra.activeWindows = activeWindows;
 		qp_extra.gset_data = gset_data;
 
+		/*
+		 * Check if we're a subquery for a set operation.  If we are, store
+		 * the SetOperationStmt in qp_extra.
+		 */
+		if (root->parent_root != NULL &&
+			root->parent_root->parse->setOperations != NULL &&
+			IsA(root->parent_root->parse->setOperations, SetOperationStmt))
+			qp_extra.setop =
+				(SetOperationStmt *) root->parent_root->parse->setOperations;
+		else
+			qp_extra.setop = NULL;
+
 		/*
 		 * Generate the best unsorted and presorted paths for the scan/join
 		 * portion of this Query, ie the processing represented by the
@@ -3433,6 +3450,27 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 									  parse->sortClause,
 									  tlist);
 
+	/* setting setop_pathkeys might be useful to the union planner */
+	if (qp_extra->setop != NULL &&
+		set_operation_ordered_results_useful(qp_extra->setop))
+	{
+		List	   *groupClauses;
+		bool		sortable;
+
+		groupClauses = generate_setop_child_grouplist(qp_extra->setop, tlist);
+
+		root->setop_pathkeys =
+			make_pathkeys_for_sortclauses_extended(root,
+												   &groupClauses,
+												   tlist,
+												   false,
+												   &sortable);
+		if (!sortable)
+			root->setop_pathkeys = NIL;
+	}
+	else
+		root->setop_pathkeys = NIL;
+
 	/*
 	 * Figure out whether we want a sorted result from query_planner.
 	 *
@@ -3442,7 +3480,9 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 	 * sortable DISTINCT clause that's more rigorous than the ORDER BY clause,
 	 * we try to produce output that's sufficiently well sorted for the
 	 * DISTINCT.  Otherwise, if there is an ORDER BY clause, we want to sort
-	 * by the ORDER BY clause.
+	 * by the ORDER BY clause.  Otherwise, if we're a subquery being planned
+	 * for a set operation which can benefit from presorted results and have a
+	 * sortable targetlist, we want to sort by the target list.
 	 *
 	 * Note: if we have both ORDER BY and GROUP BY, and ORDER BY is a superset
 	 * of GROUP BY, it would be tempting to request sort by ORDER BY --- but
@@ -3460,6 +3500,8 @@ standard_qp_callback(PlannerInfo *root, void *extra)
 		root->query_pathkeys = root->distinct_pathkeys;
 	else if (root->sort_pathkeys)
 		root->query_pathkeys = root->sort_pathkeys;
+	else if (root->setop_pathkeys != NIL)
+		root->query_pathkeys = root->setop_pathkeys;
 	else
 		root->query_pathkeys = NIL;
 }
@@ -7855,3 +7897,47 @@ group_by_has_partkey(RelOptInfo *input_rel,
 
 	return true;
 }
+
+/*
+ * generate_setop_child_grouplist
+ *		Build a SortGroupClause list defining the sort/grouping properties
+ *		of the child of a set operation.
+ *
+ * This is similar to generate_setop_grouplist() but differs as the setop
+ * child query's targetlist entries may already have a tleSortGroupRef
+ * assigned for other purposes, such as GROUP BYs.  Here we keep the
+ * SortGroupClause list in the same order as 'op' groupClauses and just adjust
+ * the tleSortGroupRef to reference the TargetEntry's 'ressortgroupref'.
+ */
+static List *
+generate_setop_child_grouplist(SetOperationStmt *op, List *targetlist)
+{
+	List	   *grouplist = copyObject(op->groupClauses);
+	ListCell   *lg;
+	ListCell   *lt;
+
+	lg = list_head(grouplist);
+	foreach(lt, targetlist)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(lt);
+		SortGroupClause *sgc;
+
+		if (tle->resjunk)
+		{
+			/* resjunk columns should not have sortgrouprefs */
+			Assert(tle->ressortgroupref == 0);
+			continue;			/* ignore resjunk columns */
+		}
+
+		/* non-resjunk columns should have grouping clauses */
+		Assert(lg != NULL);
+		sgc = (SortGroupClause *) lfirst(lg);
+		lg = lnext(grouplist, lg);
+
+		/* assign a tleSortGroupRef, or reuse the existing one */
+		sgc->tleSortGroupRef = assignSortGroupRef(tle, targetlist);
+		tle->ressortgroupref = sgc->tleSortGroupRef;
+	}
+	Assert(lg == NULL);
+	return grouplist;
+}
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index a5bfd7a3f7..a5ff8329b8 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -43,11 +43,15 @@ static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
 										  bool junkOK,
 										  int flag, List *refnames_tlist,
 										  List **pTargetList,
+										  bool *istrivial_tlist,
 										  double *pNumGroups);
 static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
 										   PlannerInfo *root,
 										   List *refnames_tlist,
 										   List **pTargetList);
+static void build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
+									bool trivial_tlist, List *child_tlist,
+									List *interesting_pathkeys);
 static RelOptInfo *generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 										List *refnames_tlist,
 										List **pTargetList);
@@ -57,9 +61,8 @@ static RelOptInfo *generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *ro
 static List *plan_union_children(PlannerInfo *root,
 								 SetOperationStmt *top_union,
 								 List *refnames_tlist,
-								 List **tlist_list);
-static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-							   PlannerInfo *root);
+								 List **tlist_list,
+								 List **istrivial_tlist);
 static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 								Path *input_path,
@@ -152,6 +155,8 @@ plan_set_operations(PlannerInfo *root)
 	}
 	else
 	{
+		bool		trivial_tlist;
+
 		/*
 		 * Recurse on setOperations tree to generate paths for set ops. The
 		 * final output paths should have just the column types shown as the
@@ -163,6 +168,7 @@ plan_set_operations(PlannerInfo *root)
 										   true, -1,
 										   leftmostQuery->targetList,
 										   &top_tlist,
+										   &trivial_tlist,
 										   NULL);
 	}
 
@@ -172,6 +178,33 @@ plan_set_operations(PlannerInfo *root)
 	return setop_rel;
 }
 
+/*
+ * set_operation_ordered_results_useful
+ *		Return true if the given SetOperationStmt can be executed by utilizing
+ *		paths that provide sorted input according to the setop's targetlist.
+ *		Returns false when sorted paths are not any more useful then unsorted
+ *		Returns false when sorted paths are not any more useful than unsorted
+ */
+bool
+set_operation_ordered_results_useful(SetOperationStmt *setop)
+{
+	/*
+	 * Paths sorted by the targetlist are useful for UNION as we can opt to
+	 * MergeAppend the sorted paths then Unique them.  Ordered paths are no
+	 * more useful than unordered ones for UNION ALL.
+	 */
+	if (!setop->all && setop->op == SETOP_UNION)
+		return true;
+
+	/*
+	 * EXCEPT / EXCEPT ALL / INTERSECT / INTERSECT ALL cannot yet utilize
+	 * correctly sorted input paths.
+	 */
+
+	/* sorted paths are not deemed of any additional use for this setop */
+	return false;
+}
+
 /*
  * recurse_set_operations
  *	  Recursively handle one step in a tree of set operations
@@ -192,6 +225,10 @@ plan_set_operations(PlannerInfo *root)
  * the logic in this file depends on flag columns being marked resjunk.
  * Pending a redesign of how that works, this is the easy way out.
  *
+ * 'istrivial_tlist' is an output parameter and is set to true when the
+ * pTargetList is deemed to be a trivial target list as determined by
+ * generate_setop_tlist().
+ *
  * We don't have to care about typmods here: the only allowed difference
  * between set-op input and output typmods is input is a specific typmod
  * and output is -1, and that does not require a coercion.
@@ -202,9 +239,12 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   bool junkOK,
 					   int flag, List *refnames_tlist,
 					   List **pTargetList,
+					   bool *istrivial_tlist,
 					   double *pNumGroups)
 {
-	RelOptInfo *rel = NULL;		/* keep compiler quiet */
+	RelOptInfo *rel;
+
+	*istrivial_tlist = true;	/* for now */
 
 	/* Guard against stack overflow due to overly complex setop nests */
 	check_stack_depth();
@@ -216,8 +256,6 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		Query	   *subquery = rte->subquery;
 		PlannerInfo *subroot;
 		RelOptInfo *final_rel;
-		Path	   *subpath;
-		Path	   *path;
 		List	   *tlist;
 		bool		trivial_tlist;
 
@@ -254,61 +292,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		/* Return the fully-fledged tlist to caller, too */
 		*pTargetList = tlist;
-
-		/*
-		 * Mark rel with estimated output rows, width, etc.  Note that we have
-		 * to do this before generating outer-query paths, else
-		 * cost_subqueryscan is not happy.
-		 */
-		set_subquery_size_estimates(root, rel);
-
-		/*
-		 * Since we may want to add a partial path to this relation, we must
-		 * set its consider_parallel flag correctly.
-		 */
+		*istrivial_tlist = trivial_tlist;
 		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
-		rel->consider_parallel = final_rel->consider_parallel;
-
-		/*
-		 * For the moment, we consider only a single Path for the subquery.
-		 * This should change soon (make it look more like
-		 * set_subquery_pathlist).
-		 */
-		subpath = get_cheapest_fractional_path(final_rel,
-											   root->tuple_fraction);
-
-		/*
-		 * Stick a SubqueryScanPath atop that.
-		 *
-		 * We don't bother to determine the subquery's output ordering since
-		 * it won't be reflected in the set-op result anyhow; so just label
-		 * the SubqueryScanPath with nil pathkeys.  (XXX that should change
-		 * soon too, likely.)
-		 */
-		path = (Path *) create_subqueryscan_path(root, rel, subpath,
-												 trivial_tlist,
-												 NIL, NULL);
-
-		add_path(rel, path);
-
-		/*
-		 * If we have a partial path for the child relation, we can use that
-		 * to build a partial path for this relation.  But there's no point in
-		 * considering any path but the cheapest.
-		 */
-		if (rel->consider_parallel && bms_is_empty(rel->lateral_relids) &&
-			final_rel->partial_pathlist != NIL)
-		{
-			Path	   *partial_subpath;
-			Path	   *partial_path;
-
-			partial_subpath = linitial(final_rel->partial_pathlist);
-			partial_path = (Path *)
-				create_subqueryscan_path(root, rel, partial_subpath,
-										 trivial_tlist,
-										 NIL, NULL);
-			add_partial_path(rel, partial_path);
-		}
 
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
@@ -333,11 +318,11 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			if (subquery->groupClause || subquery->groupingSets ||
 				subquery->distinctClause ||
 				subroot->hasHavingQual || subquery->hasAggs)
-				*pNumGroups = subpath->rows;
+				*pNumGroups = final_rel->rows;
 			else
 				*pNumGroups = estimate_num_groups(subroot,
 												  get_tlist_exprs(subroot->parse->targetList, false),
-												  subpath->rows,
+												  final_rel->rows,
 												  NULL,
 												  NULL);
 		}
@@ -386,6 +371,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 												*pTargetList,
 												refnames_tlist,
 												&trivial_tlist);
+			*istrivial_tlist = trivial_tlist;
 			target = create_pathtarget(root, *pTargetList);
 
 			/* Apply projection to each path */
@@ -416,16 +402,16 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 				lfirst(lc) = path;
 			}
 		}
+		postprocess_setop_rel(root, rel);
 	}
 	else
 	{
 		elog(ERROR, "unrecognized node type: %d",
 			 (int) nodeTag(setOp));
 		*pTargetList = NIL;
+		rel = NULL;				/* keep compiler quiet */
 	}
 
-	postprocess_setop_rel(root, rel);
-
 	return rel;
 }
 
@@ -444,7 +430,9 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	Path	   *lpath;
 	Path	   *rpath;
 	List	   *lpath_tlist;
+	bool		lpath_trivial_tlist;
 	List	   *rpath_tlist;
+	bool		rpath_trivial_tlist;
 	List	   *tlist;
 	List	   *groupList;
 	double		dNumGroups;
@@ -464,7 +452,10 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 								  false, -1,
 								  refnames_tlist,
 								  &lpath_tlist,
-								  NULL);
+								  &lpath_trivial_tlist, NULL);
+	if (lrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, lrel, lpath_trivial_tlist, lpath_tlist,
+								NIL);
 	lpath = lrel->cheapest_total_path;
 	/* The right path will want to look at the left one ... */
 	root->non_recursive_path = lpath;
@@ -473,7 +464,11 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 								  false, -1,
 								  refnames_tlist,
 								  &rpath_tlist,
+								  &rpath_trivial_tlist,
 								  NULL);
+	if (rrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, rrel, rpath_trivial_tlist, rpath_tlist,
+								NIL);
 	rpath = rrel->cheapest_total_path;
 	root->non_recursive_path = NULL;
 
@@ -535,6 +530,166 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	return result_rel;
 }
 
+/*
+ * build_setop_child_paths
+ *		Build paths for the set op child relation denoted by 'rel'.
+ *
+ * If 'interesting_pathkeys' is non-NIL then also include paths that suit
+ * these pathkeys, sorting any unsorted paths as required.
+ */
+static void
+build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
+						bool trivial_tlist, List *child_tlist,
+						List *interesting_pathkeys)
+{
+	RelOptInfo *final_rel;
+	List	   *setop_pathkeys = rel->subroot->setop_pathkeys;
+	ListCell   *lc;
+
+	/* it can't be a set op child rel if it's not a subquery */
+	Assert(rel->rtekind == RTE_SUBQUERY);
+
+	/* when sorting is needed, add child rel equivalences */
+	if (interesting_pathkeys != NIL)
+		add_setop_child_rel_equivalences(root,
+										 rel->relids,
+										 child_tlist,
+										 interesting_pathkeys);
+
+	/*
+	 * Mark rel with estimated output rows, width, etc.  Note that we have to
+	 * do this before generating outer-query paths, else cost_subqueryscan is
+	 * not happy.
+	 */
+	set_subquery_size_estimates(root, rel);
+
+	/*
+	 * Since we may want to add a partial path to this relation, we must set
+	 * its consider_parallel flag correctly.
+	 */
+	final_rel = fetch_upper_rel(rel->subroot, UPPERREL_FINAL, NULL);
+	rel->consider_parallel = final_rel->consider_parallel;
+
+	/* Generate subquery scan paths for any interesting path in final_rel */
+	foreach(lc, final_rel->pathlist)
+	{
+		Path	   *subpath = (Path *) lfirst(lc);
+		List	   *pathkeys;
+		Path	   *cheapest_input_path = final_rel->cheapest_total_path;
+		bool		is_sorted;
+		int			presorted_keys;
+
+		/*
+		 * Include the cheapest path as-is so that the set operation can be
+		 * cheaply implemented using a method which does not require the input
+		 * to be sorted.
+		 */
+		if (subpath == cheapest_input_path)
+		{
+			/* Convert subpath's pathkeys to outer representation */
+			pathkeys = convert_subquery_pathkeys(root, rel, subpath->pathkeys,
+												 make_tlist_from_pathtarget(subpath->pathtarget));
+
+			/* Generate outer path using this subpath */
+			add_path(rel, (Path *) create_subqueryscan_path(root,
+															rel,
+															subpath,
+															trivial_tlist,
+															pathkeys,
+															NULL));
+		}
+
+		/* skip dealing with sorted paths if the setop doesn't need them */
+		if (interesting_pathkeys == NIL)
+			continue;
+
+		/*
+		 * Create paths to suit final sort order required for setop_pathkeys.
+		 * Here we'll sort the cheapest input path (if not sorted already) and
+		 * incremental sort any paths which are partially sorted.
+		 */
+		is_sorted = pathkeys_count_contained_in(setop_pathkeys,
+												subpath->pathkeys,
+												&presorted_keys);
+
+		if (!is_sorted)
+		{
+			double		limittuples = rel->subroot->limit_tuples;
+
+			/*
+			 * Try at least sorting the cheapest path and also try
+			 * incrementally sorting any path which is partially sorted
+			 * already (no need to deal with paths which have presorted keys
+			 * when incremental sort is disabled unless it's the cheapest
+			 * input path).
+			 */
+			if (subpath != cheapest_input_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * We've no need to consider both a sort and incremental sort.
+			 * We'll just do a sort if there are no presorted keys and an
+			 * incremental sort when there are presorted keys.
+			 */
+			if (presorted_keys == 0 || !enable_incremental_sort)
+				subpath = (Path *) create_sort_path(rel->subroot,
+													final_rel,
+													subpath,
+													setop_pathkeys,
+													limittuples);
+			else
+				subpath = (Path *) create_incremental_sort_path(rel->subroot,
+																final_rel,
+																subpath,
+																setop_pathkeys,
+																presorted_keys,
+																limittuples);
+		}
+
+		/* Add the sorted path unless the cheapest path is already sorted. */
+		if (subpath != cheapest_input_path)
+		{
+			/* Convert subpath's pathkeys to outer representation */
+			pathkeys = convert_subquery_pathkeys(root, rel, subpath->pathkeys,
+												 make_tlist_from_pathtarget(subpath->pathtarget));
+
+			/* Generate outer path using this subpath */
+			add_path(rel, (Path *) create_subqueryscan_path(root,
+															rel,
+															subpath,
+															trivial_tlist,
+															pathkeys,
+															NULL));
+		}
+	}
+
+	/* if consider_parallel is false, there should be no partial paths */
+	Assert(final_rel->consider_parallel ||
+		   final_rel->partial_pathlist == NIL);
+
+	/*
+	 * If we have a partial path for the child relation, we can use that
+	 * to build a partial path for this relation.  But there's no point in
+	 * considering any path but the cheapest.
+	 */
+	if (rel->consider_parallel && bms_is_empty(rel->lateral_relids) &&
+		final_rel->partial_pathlist != NIL)
+	{
+		Path	   *partial_subpath;
+		Path	   *partial_path;
+
+		partial_subpath = linitial(final_rel->partial_pathlist);
+		partial_path = (Path *)
+			create_subqueryscan_path(root, rel, partial_subpath,
+									 trivial_tlist,
+									 NIL, NULL);
+		add_partial_path(rel, partial_path);
+	}
+
+	postprocess_setop_rel(root, rel);
+}
+
 /*
  * Generate paths for a UNION or UNION ALL node
  */
@@ -545,30 +700,23 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 {
 	Relids		relids = NULL;
 	RelOptInfo *result_rel;
-	double		save_fraction = root->tuple_fraction;
 	ListCell   *lc;
-	List	   *pathlist = NIL;
+	ListCell   *lc2;
+	ListCell   *lc3;
+	List	   *cheapest_pathlist = NIL;
+	List	   *ordered_pathlist = NIL;
 	List	   *partial_pathlist = NIL;
 	bool		partial_paths_valid = true;
 	bool		consider_parallel = true;
 	List	   *rellist;
 	List	   *tlist_list;
+	List	   *trivial_tlist_list;
 	List	   *tlist;
-	Path	   *path;
-
-	/*
-	 * If plain UNION, tell children to fetch all tuples.
-	 *
-	 * Note: in UNION ALL, we pass the top-level tuple_fraction unmodified to
-	 * each arm of the UNION ALL.  One could make a case for reducing the
-	 * tuple fraction for later arms (discounting by the expected size of the
-	 * earlier arms' results) but it seems not worth the trouble. The normal
-	 * case where tuple_fraction isn't already zero is a LIMIT at top level,
-	 * and passing it down as-is is usually enough to get the desired result
-	 * of preferring fast-start plans.
-	 */
-	if (!op->all)
-		root->tuple_fraction = 0.0;
+	List	   *groupList = NIL;
+	Path	   *apath;
+	Path	   *gpath = NULL;
+	bool		try_sorted;
+	List	   *union_pathkeys = NIL;
 
 	/*
 	 * If any of my children are identical UNION nodes (same op, all-flag, and
@@ -576,7 +724,11 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+	rellist = plan_union_children(root,
+								  op,
+								  refnames_tlist,
+								  &tlist_list,
+								  &trivial_tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -587,15 +739,68 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	tlist = generate_append_tlist(op->colTypes, op->colCollations, false,
 								  tlist_list, refnames_tlist);
-
 	*pTargetList = tlist;
 
+	/* For UNIONs (not UNION ALL), try sorting, if sorting is possible */
+	try_sorted = !op->all && grouping_is_sortable(op->groupClauses);
+
+	if (try_sorted)
+	{
+		/* Identify the grouping semantics */
+		groupList = generate_setop_grouplist(op, tlist);
+
+		/* Determine the pathkeys for sorting by the whole target list */
+		union_pathkeys = make_pathkeys_for_sortclauses(root, groupList, tlist);
+
+		root->query_pathkeys = union_pathkeys;
+	}
+
+	/*
+	 * Now that we've got the append target list, we can build the union child
+	 * paths.
+	 */
+	forthree(lc, rellist, lc2, trivial_tlist_list, lc3, tlist_list)
+	{
+		RelOptInfo *rel = lfirst(lc);
+		bool		trivial_tlist = lfirst_int(lc2);
+		List	   *child_tlist = lfirst_node(List, lc3);
+
+		/* only build paths for the union children */
+		if (rel->rtekind == RTE_SUBQUERY)
+			build_setop_child_paths(root, rel, trivial_tlist, child_tlist,
+									union_pathkeys);
+	}
+
 	/* Build path lists and relid set. */
 	foreach(lc, rellist)
 	{
 		RelOptInfo *rel = lfirst(lc);
+		Path	   *ordered_path;
 
-		pathlist = lappend(pathlist, rel->cheapest_total_path);
+		cheapest_pathlist = lappend(cheapest_pathlist,
+									rel->cheapest_total_path);
+
+		if (try_sorted)
+		{
+			ordered_path = get_cheapest_path_for_pathkeys(rel->pathlist,
+														  union_pathkeys,
+														  NULL,
+														  TOTAL_COST,
+														  false);
+
+			if (ordered_path != NULL)
+				ordered_pathlist = lappend(ordered_pathlist, ordered_path);
+			else
+			{
+				/*
+				 * If we can't find a sorted path, just give up trying to
+				 * generate a list of correctly sorted child paths.  This can
+				 * happen when type coercion was added to the targetlist due
+				 * to mismatching types from the union children.
+				 */
+				try_sorted = false;
+			}
+		}
 
 		if (consider_parallel)
 		{
@@ -618,28 +823,21 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
 	result_rel->reltarget = create_pathtarget(root, tlist);
 	result_rel->consider_parallel = consider_parallel;
+	result_rel->consider_startup = (root->tuple_fraction > 0);
 
 	/*
-	 * Append the child results together.
+	 * Append the child results together using the cheapest paths from each
+	 * union child.
 	 */
-	path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
-									   NIL, NULL, 0, false, -1);
-
-	/*
-	 * For UNION ALL, we just need the Append path.  For UNION, need to add
-	 * node(s) to remove duplicates.
-	 */
-	if (!op->all)
-		path = make_union_unique(op, path, tlist, root);
-
-	add_path(result_rel, path);
+	apath = (Path *) create_append_path(root, result_rel, cheapest_pathlist,
+										NIL, NIL, NULL, 0, false, -1);
 
 	/*
 	 * Estimate number of groups.  For now we just assume the output is unique
 	 * --- this is certainly true for the UNION case, and we want worst-case
 	 * estimates anyway.
 	 */
-	result_rel->rows = path->rows;
+	result_rel->rows = apath->rows;
 
 	/*
 	 * Now consider doing the same thing using the partial paths plus Append
@@ -647,7 +845,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	if (partial_paths_valid)
 	{
-		Path	   *ppath;
+		Path	   *papath;
 		int			parallel_workers = 0;
 
 		/* Find the highest number of workers requested for any subpath. */
@@ -676,21 +874,136 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 		}
 		Assert(parallel_workers > 0);
 
-		ppath = (Path *)
+		papath = (Path *)
 			create_append_path(root, result_rel, NIL, partial_pathlist,
-							   NIL, NULL,
-							   parallel_workers, enable_parallel_append,
-							   -1);
-		ppath = (Path *)
-			create_gather_path(root, result_rel, ppath,
+							   NIL, NULL, parallel_workers,
+							   enable_parallel_append, -1);
+		gpath = (Path *)
+			create_gather_path(root, result_rel, papath,
 							   result_rel->reltarget, NULL, NULL);
-		if (!op->all)
-			ppath = make_union_unique(op, ppath, tlist, root);
-		add_path(result_rel, ppath);
 	}
 
-	/* Undo effects of possibly forcing tuple_fraction to 0 */
-	root->tuple_fraction = save_fraction;
+	if (!op->all)
+	{
+		double		dNumGroups;
+		bool		can_sort = grouping_is_sortable(groupList);
+		bool		can_hash = grouping_is_hashable(groupList);
+
+		/*
+		 * XXX for the moment, take the number of distinct groups as equal to
+		 * the total input size, i.e., the worst case.  This is too
+		 * conservative, but it's not clear how to get a decent estimate of
+		 * the true size.  One should note as well the propensity of novices
+		 * to write UNION rather than UNION ALL even when they don't expect
+		 * any duplicates...
+		 */
+		dNumGroups = apath->rows;
+
+		if (can_hash)
+		{
+			Path	   *path;
+
+			/*
+			 * Try a hash aggregate plan on 'apath'.  This is the cheapest
+			 * available path containing each append child.
+			 */
+			path = (Path *) create_agg_path(root,
+											result_rel,
+											apath,
+											create_pathtarget(root, tlist),
+											AGG_HASHED,
+											AGGSPLIT_SIMPLE,
+											groupList,
+											NIL,
+											NULL,
+											dNumGroups);
+			add_path(result_rel, path);
+
+			/* Try hash aggregate on the Gather path, if valid */
+			if (gpath != NULL)
+			{
+				/* Hashed aggregate plan --- no sort needed */
+				path = (Path *) create_agg_path(root,
+												result_rel,
+												gpath,
+												create_pathtarget(root, tlist),
+												AGG_HASHED,
+												AGGSPLIT_SIMPLE,
+												groupList,
+												NIL,
+												NULL,
+												dNumGroups);
+				add_path(result_rel, path);
+			}
+		}
+
+		if (can_sort)
+		{
+			Path	   *path = apath;
+
+			if (groupList != NIL)
+				path = (Path *) create_sort_path(root, result_rel, path,
+												 make_pathkeys_for_sortclauses(root, groupList, tlist),
+												 -1.0);
+
+			path = (Path *) create_upper_unique_path(root,
+													 result_rel,
+													 path,
+													 list_length(path->pathkeys),
+													 dNumGroups);
+
+			add_path(result_rel, path);
+
+			/* Try Sort -> Unique on the Gather path, if set */
+			if (gpath != NULL)
+			{
+				path = gpath;
+
+				path = (Path *) create_sort_path(root, result_rel, path,
+												 make_pathkeys_for_sortclauses(root, groupList, tlist),
+												 -1.0);
+
+				path = (Path *) create_upper_unique_path(root,
+														 result_rel,
+														 path,
+														 list_length(path->pathkeys),
+														 dNumGroups);
+				add_path(result_rel, path);
+			}
+		}
+
+		/*
+		 * Try making a MergeAppend path if we managed to find a path with the
+		 * correct pathkeys in each union child query.
+		 */
+		if (try_sorted && groupList != NIL)
+		{
+			Path	   *path;
+
+			path = (Path *) create_merge_append_path(root,
+													 result_rel,
+													 ordered_pathlist,
+													 union_pathkeys,
+													 NULL);
+
+			/* and make the MergeAppend unique */
+			path = (Path *) create_upper_unique_path(root,
+													 result_rel,
+													 path,
+													 list_length(tlist),
+													 dNumGroups);
+
+			add_path(result_rel, path);
+		}
+	}
+	else
+	{
+		/* UNION ALL */
+		add_path(result_rel, apath);
+
+		if (gpath != NULL)
+			add_path(result_rel, gpath);
+	}
 
 	return result_rel;
 }
@@ -716,6 +1029,8 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 			   *tlist,
 			   *groupList,
 			   *pathlist;
+	bool		lpath_trivial_tlist,
+				rpath_trivial_tlist;
 	double		dLeftGroups,
 				dRightGroups,
 				dNumGroups,
@@ -735,14 +1050,22 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 								  false, 0,
 								  refnames_tlist,
 								  &lpath_tlist,
+								  &lpath_trivial_tlist,
 								  &dLeftGroups);
+	if (lrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, lrel, lpath_trivial_tlist, lpath_tlist,
+								NIL);
 	lpath = lrel->cheapest_total_path;
 	rrel = recurse_set_operations(op->rarg, root,
 								  op->colTypes, op->colCollations,
 								  false, 1,
 								  refnames_tlist,
 								  &rpath_tlist,
+								  &rpath_trivial_tlist,
 								  &dRightGroups);
+	if (rrel->rtekind == RTE_SUBQUERY)
+		build_setop_child_paths(root, rrel, rpath_trivial_tlist, rpath_tlist,
+								NIL);
 	rpath = rrel->cheapest_total_path;
 
 	/* Undo effects of forcing tuple_fraction to 0 */
@@ -879,13 +1202,16 @@ static List *
 plan_union_children(PlannerInfo *root,
 					SetOperationStmt *top_union,
 					List *refnames_tlist,
-					List **tlist_list)
+					List **tlist_list,
+					List **istrivial_tlist)
 {
 	List	   *pending_rels = list_make1(top_union);
 	List	   *result = NIL;
 	List	   *child_tlist;
+	bool		trivial_tlist;
 
 	*tlist_list = NIL;
+	*istrivial_tlist = NIL;
 
 	while (pending_rels != NIL)
 	{
@@ -924,75 +1250,15 @@ plan_union_children(PlannerInfo *root,
 														false, -1,
 														refnames_tlist,
 														&child_tlist,
+														&trivial_tlist,
 														NULL));
 		*tlist_list = lappend(*tlist_list, child_tlist);
+		*istrivial_tlist = lappend_int(*istrivial_tlist, trivial_tlist);
 	}
 
 	return result;
 }
 
-/*
- * Add nodes to the given path tree to unique-ify the result of a UNION.
- */
-static Path *
-make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
-				  PlannerInfo *root)
-{
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-	List	   *groupList;
-	double		dNumGroups;
-
-	/* Identify the grouping semantics */
-	groupList = generate_setop_grouplist(op, tlist);
-
-	/*
-	 * XXX for the moment, take the number of distinct groups as equal to the
-	 * total input size, ie, the worst case.  This is too conservative, but
-	 * it's not clear how to get a decent estimate of the true size.  One
-	 * should note as well the propensity of novices to write UNION rather
-	 * than UNION ALL even when they don't expect any duplicates...
-	 */
-	dNumGroups = path->rows;
-
-	/* Decide whether to hash or sort */
-	if (choose_hashed_setop(root, groupList, path,
-							dNumGroups, dNumGroups,
-							"UNION"))
-	{
-		/* Hashed aggregate plan --- no sort needed */
-		path = (Path *) create_agg_path(root,
-										result_rel,
-										path,
-										create_pathtarget(root, tlist),
-										AGG_HASHED,
-										AGGSPLIT_SIMPLE,
-										groupList,
-										NIL,
-										NULL,
-										dNumGroups);
-	}
-	else
-	{
-		/* Sort and Unique */
-		if (groupList)
-			path = (Path *)
-				create_sort_path(root,
-								 result_rel,
-								 path,
-								 make_pathkeys_for_sortclauses(root,
-															   groupList,
-															   tlist),
-								 -1.0);
-		path = (Path *) create_upper_unique_path(root,
-												 result_rel,
-												 path,
-												 list_length(path->pathkeys),
-												 dNumGroups);
-	}
-
-	return path;
-}
-
 /*
  * postprocess_setop_rel - perform steps required after adding paths
  */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 534692bee1..b3069516b2 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -397,6 +397,8 @@ struct PlannerInfo
 	List	   *distinct_pathkeys;
 	/* sortClause pathkeys, if any */
 	List	   *sort_pathkeys;
+	/* set operator pathkeys, if any */
+	List	   *setop_pathkeys;
 
 	/* Canonicalised partition schemes used in the query. */
 	List	   *part_schemes pg_node_attr(read_write_ignore);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 0e8a9c94ba..1165e98c5b 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -174,6 +174,9 @@ extern void add_child_join_rel_equivalences(PlannerInfo *root,
 											AppendRelInfo **appinfos,
 											RelOptInfo *parent_joinrel,
 											RelOptInfo *child_joinrel);
+extern void add_setop_child_rel_equivalences(PlannerInfo *root, Relids relids,
+											 List *tlist,
+											 List *setop_pathkeys);
 extern List *generate_implied_equalities_for_column(PlannerInfo *root,
 													RelOptInfo *rel,
 													ec_matches_callback_type callback,
diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h
index 8e00716dc8..a52dec285d 100644
--- a/src/include/optimizer/prep.h
+++ b/src/include/optimizer/prep.h
@@ -53,6 +53,6 @@ extern void preprocess_aggrefs(PlannerInfo *root, Node *clause);
  * prototypes for prepunion.c
  */
 extern RelOptInfo *plan_set_operations(PlannerInfo *root);
-
+extern bool set_operation_ordered_results_useful(SetOperationStmt *setop);
 
 #endif							/* PREP_H */
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 8ca93f4dea..4b8c8f143f 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1396,6 +1396,7 @@ SELECT x FROM test3cs WHERE x ~ 'a';
  abc
 (1 row)
 
+SET enable_hashagg TO off;
 SELECT x FROM test1cs UNION SELECT x FROM test2cs ORDER BY x;
   x  
 -----
@@ -1448,6 +1449,7 @@ SELECT DISTINCT x FROM test3cs ORDER BY x;
  ghi
 (4 rows)
 
+RESET enable_hashagg;
 SELECT count(DISTINCT x) FROM test3cs;
  count 
 -------
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 7fdb685313..5fd54a10b1 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -1472,14 +1472,19 @@ explain (costs off) select * from t union select * from t order by 1,3;
    Sort Key: t.a, t.c
    Presorted Key: t.a
    ->  Unique
-         ->  Sort
+         ->  Merge Append
                Sort Key: t.a, t.b, t.c
-               ->  Gather
+               ->  Gather Merge
                      Workers Planned: 2
-                     ->  Parallel Append
+                     ->  Sort
+                           Sort Key: t.a, t.b, t.c
                            ->  Parallel Seq Scan on t
+               ->  Gather Merge
+                     Workers Planned: 2
+                     ->  Sort
+                           Sort Key: t_1.a, t_1.b, t_1.c
                            ->  Parallel Seq Scan on t t_1
-(11 rows)
+(16 rows)
 
 -- Full sort, not just incremental sort can be pushed below a gather merge path
 -- by generate_useful_gather_paths.
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 882017afc9..c31c256b86 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -370,16 +370,16 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                     QUERY PLAN                                     
-------------------------------------------------------------------------------------
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  HashSetOp Intersect
                ->  Append
-                     ->  Subquery Scan on "*SELECT* 2"
-                           ->  Seq Scan on tenk1
                      ->  Subquery Scan on "*SELECT* 1"
-                           ->  Index Only Scan using tenk1_unique1 on tenk1 tenk1_1
+                           ->  Index Only Scan using tenk1_unique1 on tenk1
+                     ->  Subquery Scan on "*SELECT* 2"
+                           ->  Seq Scan on tenk1 tenk1_1
 (8 rows)
 
 select count(*) from
@@ -412,16 +412,17 @@ set enable_hashagg to off;
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 union select fivethous from tenk1 ) ss;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                           QUERY PLAN                           
+----------------------------------------------------------------
  Aggregate
    ->  Unique
-         ->  Sort
+         ->  Merge Append
                Sort Key: tenk1.unique1
-               ->  Append
-                     ->  Index Only Scan using tenk1_unique1 on tenk1
+               ->  Index Only Scan using tenk1_unique1 on tenk1
+               ->  Sort
+                     Sort Key: tenk1_1.fivethous
                      ->  Seq Scan on tenk1 tenk1_1
-(7 rows)
+(8 rows)
 
 select count(*) from
   ( select unique1 from tenk1 union select fivethous from tenk1 ) ss;
@@ -433,18 +434,18 @@ select count(*) from
 explain (costs off)
 select count(*) from
   ( select unique1 from tenk1 intersect select fivethous from tenk1 ) ss;
-                                        QUERY PLAN                                        
-------------------------------------------------------------------------------------------
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
  Aggregate
    ->  Subquery Scan on ss
          ->  SetOp Intersect
                ->  Sort
-                     Sort Key: "*SELECT* 2".fivethous
+                     Sort Key: "*SELECT* 1".unique1
                      ->  Append
-                           ->  Subquery Scan on "*SELECT* 2"
-                                 ->  Seq Scan on tenk1
                            ->  Subquery Scan on "*SELECT* 1"
-                                 ->  Index Only Scan using tenk1_unique1 on tenk1 tenk1_1
+                                 ->  Index Only Scan using tenk1_unique1 on tenk1
+                           ->  Subquery Scan on "*SELECT* 2"
+                                 ->  Seq Scan on tenk1 tenk1_1
 (10 rows)
 
 select count(*) from
@@ -950,16 +951,8 @@ select except select;
 -- check hashed implementation
 set enable_hashagg = true;
 set enable_sort = false;
-explain (costs off)
-select from generate_series(1,5) union select from generate_series(1,3);
-                           QUERY PLAN                           
-----------------------------------------------------------------
- HashAggregate
-   ->  Append
-         ->  Function Scan on generate_series
-         ->  Function Scan on generate_series generate_series_1
-(4 rows)
-
+-- We've no way to check hashed UNION as the empty pathkeys in the Append are
+-- fine to make use of Unique, which is cheaper than HashAggregate.
 explain (costs off)
 select from generate_series(1,5) intersect select from generate_series(1,3);
                               QUERY PLAN                              
@@ -972,10 +965,6 @@ select from generate_series(1,5) intersect select from generate_series(1,3);
                ->  Function Scan on generate_series generate_series_1
 (6 rows)
 
-select from generate_series(1,5) union select from generate_series(1,3);
---
-(1 row)
-
 select from generate_series(1,5) union all select from generate_series(1,3);
 --
 (8 rows)
@@ -1102,16 +1091,17 @@ explain (costs off)
   UNION
   SELECT * FROM t2) t
  WHERE ab = 'ab';
-                    QUERY PLAN                     
----------------------------------------------------
- HashAggregate
-   Group Key: ((t1.a || t1.b))
-   ->  Append
-         ->  Index Scan using t1_ab_idx on t1
-               Index Cond: ((a || b) = 'ab'::text)
-         ->  Index Only Scan using t2_pkey on t2
-               Index Cond: (ab = 'ab'::text)
-(7 rows)
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Unique
+   ->  Sort
+         Sort Key: ((t1.a || t1.b))
+         ->  Append
+               ->  Index Scan using t1_ab_idx on t1
+                     Index Cond: ((a || b) = 'ab'::text)
+               ->  Index Only Scan using t2_pkey on t2
+                     Index Cond: (ab = 'ab'::text)
+(8 rows)
 
 --
 -- Test that ORDER BY for UNION ALL can be pushed down to inheritance
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 03837de846..80f28a97d7 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -555,6 +555,7 @@ SELECT x FROM test3cs WHERE x LIKE 'a%';
 SELECT x FROM test3cs WHERE x ILIKE 'a%';
 SELECT x FROM test3cs WHERE x SIMILAR TO 'a%';
 SELECT x FROM test3cs WHERE x ~ 'a';
+SET enable_hashagg TO off;
 SELECT x FROM test1cs UNION SELECT x FROM test2cs ORDER BY x;
 SELECT x FROM test2cs UNION SELECT x FROM test1cs ORDER BY x;
 SELECT x FROM test1cs INTERSECT SELECT x FROM test2cs;
@@ -562,6 +563,7 @@ SELECT x FROM test2cs INTERSECT SELECT x FROM test1cs;
 SELECT x FROM test1cs EXCEPT SELECT x FROM test2cs;
 SELECT x FROM test2cs EXCEPT SELECT x FROM test1cs;
 SELECT DISTINCT x FROM test3cs ORDER BY x;
+RESET enable_hashagg;
 SELECT count(DISTINCT x) FROM test3cs;
 SELECT x, count(*) FROM test3cs GROUP BY x ORDER BY x;
 SELECT x, row_number() OVER (ORDER BY x), rank() OVER (ORDER BY x) FROM test3cs ORDER BY x;
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index d160db5458..5913dca8df 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -302,12 +302,11 @@ select except select;
 set enable_hashagg = true;
 set enable_sort = false;
 
-explain (costs off)
-select from generate_series(1,5) union select from generate_series(1,3);
+-- We've no way to check hashed UNION as the empty pathkeys in the Append are
+-- fine to make use of Unique, which is cheaper than HashAggregate.
 explain (costs off)
 select from generate_series(1,5) intersect select from generate_series(1,3);
 
-select from generate_series(1,5) union select from generate_series(1,3);
 select from generate_series(1,5) union all select from generate_series(1,3);
 select from generate_series(1,5) intersect select from generate_series(1,3);
 select from generate_series(1,5) intersect all select from generate_series(1,3);
#15David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#14)
Re: Properly pathify the union planner

On Tue, 12 Mar 2024 at 23:21, David Rowley <dgrowleyml@gmail.com> wrote:

I've attached v3.

I spent quite a bit more time looking at this.

I discovered that the dNumGroups wasn't being set as it should have
been for INTERSECT and EXCEPT queries. There was a plan change as a
result of this. I've fixed this by adjusting where dNumGroups is set.
It must be delayed until after the setop child paths have been
created.

Aside from this, the changes I made were mostly cosmetic. However, I
did notice that I wasn't setting the union child RelOptInfo's
ec_indexes in add_setop_child_rel_equivalences(). I also discovered
that we're not doing that properly for the top-level RelOptInfo for
the UNION query prior to this change. The reason is that Var.varno==0
for the top-level UNION targetlist. The code in
get_eclass_for_sort_expr() effectively misses this relation due to
"while ((i = bms_next_member(newec->ec_relids, i)) > 0)". This
happens to be good because there is no root->simple_rel_array[0]
entry, so it happens to prevent that code from crashing. It seems ok that
the ec_indexes are not set for the top-level set RelOptInfo as
get_eclass_for_sort_expr() does not make use of ec_indexes while
searching for an existing EquivalenceClass. Maybe we should fix this
varno == 0 hack and adjust get_eclass_for_sort_expr() so that it makes
use of the ec_indexes.

It's possible to see this happen with a query such as:

SELECT oid FROM pg_class UNION SELECT oid FROM pg_class ORDER BY oid;

I didn't see that as a reason not to push this patch as this occurs
both with and without this change, so I've now pushed this patch.

Thank you and Andy for reviewing this.

David

#16Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#15)
Re: Properly pathify the union planner

On Mon, Mar 25, 2024 at 9:44 AM David Rowley <dgrowleyml@gmail.com> wrote:

It seems ok that
the ec_indexes are not set for the top-level set RelOptInfo as
get_eclass_for_sort_expr() does not make use of ec_indexes while
searching for an existing EquivalenceClass. Maybe we should fix this
varno == 0 hack and adjust get_eclass_for_sort_expr() so that it makes
use of the ec_indexes.

It's possible to see this happen with a query such as:

SELECT oid FROM pg_class UNION SELECT oid FROM pg_class ORDER BY oid;

I see what you said. Yeah, there might be some optimization
possibilities in this area. And I agree that this should not be a
blocker in pushing this patch.

I didn't see that as a reason not to push this patch as this occurs
both with and without this change, so I've now pushed this patch.

Great to see this patch has been pushed!

Thanks
Richard

#17Alexander Lakhin
exclusion@gmail.com
In reply to: David Rowley (#15)
Re: Properly pathify the union planner

Hello David,

25.03.2024 04:43, David Rowley wrote:

I didn't see that as a reason not to push this patch as this occurs
both with and without this change, so I've now pushed this patch.

Please look at a new assertion failure that is triggered by the following
query:
SELECT count(*) FROM (
  WITH q1(x) AS (SELECT 1)
  SELECT FROM q1 UNION SELECT FROM q1
) qu;

TRAP: failed Assert("lg != NULL"), File: "planner.c", Line: 7941, PID: 1133017

Best regards,
Alexander

#18David Rowley
dgrowleyml@gmail.com
In reply to: Alexander Lakhin (#17)
Re: Properly pathify the union planner

On Wed, 27 Mar 2024 at 06:00, Alexander Lakhin <exclusion@gmail.com> wrote:

SELECT count(*) FROM (
WITH q1(x) AS (SELECT 1)
SELECT FROM q1 UNION SELECT FROM q1
) qu;

TRAP: failed Assert("lg != NULL"), File: "planner.c", Line: 7941, PID: 1133017

Thanks for finding that.

There's something weird going on with the UNION child subquery's
setOperations field. As far as I understand, and from reading the
existing comments, this should only be set for the top-level union.

Because this field is set, it plans the CTE thinking it's a UNION
child and breaks when it can't find a SortGroupClause for the CTE's
target list item.

I'll keep digging. As far as I see the setOperations field is only set
in transformSetOperationStmt(). I'm guessing we must be doing a
copyObject() somewhere and accidentally picking up the parent's
setOperations.

David

#19Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#18)
Re: Properly pathify the union planner

On Wed, Mar 27, 2024 at 6:23 AM David Rowley <dgrowleyml@gmail.com> wrote:

Because this field is set, it plans the CTE thinking it's a UNION
child and breaks when it can't find a SortGroupClause for the CTE's
target list item.

Right. The problem here is that we mistakenly think that the CTE query
is a subquery for the set operation and thus store the SetOperationStmt
in its qp_extra. Currently the code for the check is:

/*
 * Check if we're a subquery for a set operation. If we are, store
 * the SetOperationStmt in qp_extra.
 */
if (root->parent_root != NULL &&
    root->parent_root->parse->setOperations != NULL &&
    IsA(root->parent_root->parse->setOperations, SetOperationStmt))
    qp_extra.setop =
        (SetOperationStmt *) root->parent_root->parse->setOperations;
else
    qp_extra.setop = NULL;

This check cannot tell if the subquery is for a set operation or a CTE,
because its parent might have setOperations set in both cases. Hmm, is
there any way to differentiate between the two?

Thanks
Richard

#20David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#19)
Re: Properly pathify the union planner

On Wed, 27 Mar 2024 at 16:15, Richard Guo <guofenglinux@gmail.com> wrote:

if (root->parent_root != NULL &&
    root->parent_root->parse->setOperations != NULL &&
    IsA(root->parent_root->parse->setOperations, SetOperationStmt))
    qp_extra.setop =
        (SetOperationStmt *) root->parent_root->parse->setOperations;
else
    qp_extra.setop = NULL;

This check cannot tell if the subquery is for a set operation or a CTE,
because its parent might have setOperations set in both cases. Hmm, is
there any way to differentiate between the two?

As far as I see, there's nothing to go on... well unless you counted
canSetTag, which is false for the CTE (per analyzeCTE())... but that's
certainly not the fix.

I did wonder when first working on this patch if subquery_planner()
should grow an extra parameter, or maybe consolidate some existing
ones by passing some struct that provides the planner with a bit more
context about the query. A few of the existing parameters are likely
candidates for being in such a struct. e.g. hasRecursion and
tuple_fraction. A SetOperationStmt could go in there too.

The other CTE thread about the PathKey change you worked on highlights
that something like this could be useful. I posted in [1] about this.

David

[1]: /messages/by-id/CAApHDvrF53ErmonnpO77eDiJm7PyReZ+nD=4FSsSOmaKx6+MuQ@mail.gmail.com

#21David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#20)
1 attachment(s)
Re: Properly pathify the union planner

On Wed, 27 Mar 2024 at 22:47, David Rowley <dgrowleyml@gmail.com> wrote:

I did wonder when first working on this patch if subquery_planner()
should grow an extra parameter, or maybe consolidate some existing
ones by passing some struct that provides the planner with a bit more
context about the query. A few of the existing parameters are likely
candidates for being in such a struct. e.g. hasRecursion and
tuple_fraction. A SetOperationStmt could go in there too.

The attached is roughly what I had in mind. I've not taken the time
to see what comments need to be updated, so the attached aims only to
assist discussion.

David

Attachments:

add_PlannerContext.patchtext/plain; charset=US-ASCII; name=add_PlannerContext.patchDownload
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 7bad404458..26cbe2c9a2 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2494,6 +2494,7 @@ static void
 set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 					  Index rti, RangeTblEntry *rte)
 {
+	PlannerContext context;
 	Query	   *parse = root->parse;
 	Query	   *subquery = rte->subquery;
 	bool		trivial_pathtarget;
@@ -2644,10 +2645,12 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* plan_params should not be in use in current query level */
 	Assert(root->plan_params == NIL);
 
+	context.hasRecursion = false;
+	context.tuple_fraction = tuple_fraction;
+	context.setops = NULL;
+
 	/* Generate a subroot and Paths for the subquery */
-	rel->subroot = subquery_planner(root->glob, subquery,
-									root,
-									false, tuple_fraction);
+	rel->subroot = subquery_planner(root->glob, subquery, root, &context);
 
 	/* Isolate the params needed by this specific subplan */
 	rel->subplan_params = root->plan_params;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 38d070fa00..56fa445ae8 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -127,7 +127,7 @@ typedef struct
 /* Local functions */
 static Node *preprocess_expression(PlannerInfo *root, Node *expr, int kind);
 static void preprocess_qual_conditions(PlannerInfo *root, Node *jtnode);
-static void grouping_planner(PlannerInfo *root, double tuple_fraction);
+static void grouping_planner(PlannerInfo *root, PlannerContext *context);
 static grouping_sets_data *preprocess_grouping_sets(PlannerInfo *root);
 static List *remap_to_groupclause_idx(List *groupClause, List *gsets,
 									  int *tleref_to_colnum_map);
@@ -288,6 +288,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 {
 	PlannedStmt *result;
 	PlannerGlobal *glob;
+	PlannerContext context;
 	double		tuple_fraction;
 	PlannerInfo *root;
 	RelOptInfo *final_rel;
@@ -410,9 +411,12 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 		tuple_fraction = 0.0;
 	}
 
+	context.hasRecursion = false;
+	context.tuple_fraction = tuple_fraction;
+	context.setops = NULL;
+
 	/* primary planning entry point (may recurse for subqueries) */
-	root = subquery_planner(glob, parse, NULL,
-							false, tuple_fraction);
+	root = subquery_planner(glob, parse, NULL, &context);
 
 	/* Select best Path and turn it into a Plan */
 	final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
@@ -600,9 +604,6 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
  * glob is the global state for the current planner run.
  * parse is the querytree produced by the parser & rewriter.
  * parent_root is the immediate parent Query's info (NULL at the top level).
- * hasRecursion is true if this is a recursive WITH query.
- * tuple_fraction is the fraction of tuples we expect will be retrieved.
- * tuple_fraction is interpreted as explained for grouping_planner, below.
  *
  * Basically, this routine does the stuff that should only be done once
  * per Query object.  It then calls grouping_planner.  At one time,
@@ -623,7 +624,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 PlannerInfo *
 subquery_planner(PlannerGlobal *glob, Query *parse,
 				 PlannerInfo *parent_root,
-				 bool hasRecursion, double tuple_fraction)
+				 PlannerContext *context)
 {
 	PlannerInfo *root;
 	List	   *newWithCheckOptions;
@@ -667,8 +668,8 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	root->hasPseudoConstantQuals = false;
 	root->hasAlternativeSubPlans = false;
 	root->placeholdersFrozen = false;
-	root->hasRecursion = hasRecursion;
-	if (hasRecursion)
+	root->hasRecursion = context->hasRecursion;
+	if (context->hasRecursion)
 		root->wt_param_id = assign_special_exec_param(root);
 	else
 		root->wt_param_id = -1;
@@ -1082,7 +1083,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	/*
 	 * Do the main planning.
 	 */
-	grouping_planner(root, tuple_fraction);
+	grouping_planner(root, context);
 
 	/*
 	 * Capture the set of outer-level param IDs we have access to, for use in
@@ -1274,14 +1275,6 @@ preprocess_phv_expression(PlannerInfo *root, Expr *expr)
  * This function adds all required top-level processing to the scan/join
  * Path(s) produced by query_planner.
  *
- * tuple_fraction is the fraction of tuples we expect will be retrieved.
- * tuple_fraction is interpreted as follows:
- *	  0: expect all tuples to be retrieved (normal case)
- *	  0 < tuple_fraction < 1: expect the given fraction of tuples available
- *		from the plan to be retrieved
- *	  tuple_fraction >= 1: tuple_fraction is the absolute number of tuples
- *		expected to be retrieved (ie, a LIMIT specification)
- *
  * Returns nothing; the useful output is in the Paths we attach to the
  * (UPPERREL_FINAL, NULL) upperrel in *root.  In addition,
  * root->processed_tlist contains the final processed targetlist.
@@ -1291,7 +1284,7 @@ preprocess_phv_expression(PlannerInfo *root, Expr *expr)
  *--------------------
  */
 static void
-grouping_planner(PlannerInfo *root, double tuple_fraction)
+grouping_planner(PlannerInfo *root, PlannerContext *context)
 {
 	Query	   *parse = root->parse;
 	int64		offset_est = 0;
@@ -1305,6 +1298,7 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 	RelOptInfo *current_rel;
 	RelOptInfo *final_rel;
 	FinalPathExtraData extra;
+	double		tuple_fraction = context->tuple_fraction;
 	ListCell   *lc;
 
 	/* Tweak caller-supplied tuple_fraction if have LIMIT/OFFSET */
@@ -1507,16 +1501,10 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		qp_extra.gset_data = gset_data;
 
 		/*
-		 * Check if we're a subquery for a set operation.  If we are, store
-		 * the SetOperationStmt in qp_extra.
+		 * If we're a subquery for a set operation, store the SetOperationStmt in
+		 * qp_extra.
 		 */
-		if (root->parent_root != NULL &&
-			root->parent_root->parse->setOperations != NULL &&
-			IsA(root->parent_root->parse->setOperations, SetOperationStmt))
-			qp_extra.setop =
-				(SetOperationStmt *) root->parent_root->parse->setOperations;
-		else
-			qp_extra.setop = NULL;
+		qp_extra.setop = (SetOperationStmt *) context->setops;
 
 		/*
 		 * Generate the best unsorted and presorted paths for the scan/join
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index d5fa281b10..49be10e29b 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -167,6 +167,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	bool		simple_exists = false;
 	double		tuple_fraction;
 	PlannerInfo *subroot;
+	PlannerContext context;
 	RelOptInfo *final_rel;
 	Path	   *best_path;
 	Plan	   *plan;
@@ -217,10 +218,12 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	/* plan_params should not be in use in current query level */
 	Assert(root->plan_params == NIL);
 
+	context.hasRecursion = false;
+	context.tuple_fraction = tuple_fraction;
+	context.setops = NULL;
+
 	/* Generate Paths for the subquery */
-	subroot = subquery_planner(root->glob, subquery,
-							   root,
-							   false, tuple_fraction);
+	subroot = subquery_planner(root->glob, subquery, root, &context);
 
 	/* Isolate the params needed by this specific subplan */
 	plan_params = root->plan_params;
@@ -265,10 +268,14 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 										 &newtestexpr, &paramIds);
 		if (subquery)
 		{
+			PlannerContext context;
+
+			context.hasRecursion = false;
+			context.tuple_fraction = 0.0;
+			context.setops = NULL;
+
 			/* Generate Paths for the ANY subquery; we'll need all rows */
-			subroot = subquery_planner(root->glob, subquery,
-									   root,
-									   false, 0.0);
+			subroot = subquery_planner(root->glob, subquery, root, &context);
 
 			/* Isolate the params needed by this specific subplan */
 			plan_params = root->plan_params;
@@ -891,6 +898,7 @@ SS_process_ctes(PlannerInfo *root)
 		CmdType		cmdType = ((Query *) cte->ctequery)->commandType;
 		Query	   *subquery;
 		PlannerInfo *subroot;
+		PlannerContext context;
 		RelOptInfo *final_rel;
 		Path	   *best_path;
 		Plan	   *plan;
@@ -963,13 +971,16 @@ SS_process_ctes(PlannerInfo *root)
 		/* plan_params should not be in use in current query level */
 		Assert(root->plan_params == NIL);
 
+		context.hasRecursion = cte->cterecursive;
 		/*
-		 * Generate Paths for the CTE query.  Always plan for full retrieval
-		 * --- we don't have enough info to predict otherwise.
+		 * Always plan for full retrieval --- we don't have enough info to
+		 * predict otherwise.
 		 */
-		subroot = subquery_planner(root->glob, subquery,
-								   root,
-								   cte->cterecursive, 0.0);
+		context.tuple_fraction = 0.0;
+		context.setops = NULL;
+
+		/* Generate Paths for the CTE query. */
+		subroot = subquery_planner(root->glob, subquery, root, &context);
 
 		/*
 		 * Since the current query level doesn't yet contain any RTEs, it
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 944afc7192..113c721602 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -242,6 +242,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 	if (IsA(setOp, RangeTblRef))
 	{
+		PlannerContext context;
 		RangeTblRef *rtr = (RangeTblRef *) setOp;
 		RangeTblEntry *rte = root->simple_rte_array[rtr->rtindex];
 		Query	   *subquery = rte->subquery;
@@ -257,11 +258,14 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		/* plan_params should not be in use in current query level */
 		Assert(root->plan_params == NIL);
 
+		context.hasRecursion = false;
+		context.tuple_fraction = root->tuple_fraction;
+		context.setops = castNode(SetOperationStmt,
+								  root->parse->setOperations);
+
 		/* Generate a subroot and Paths for the subquery */
-		subroot = rel->subroot = subquery_planner(root->glob, subquery,
-												  root,
-												  false,
-												  root->tuple_fraction);
+		subroot = rel->subroot = subquery_planner(root->glob, subquery, root,
+												  &context);
 
 		/*
 		 * It should not be possible for the primitive query to contain any
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 595eec2cbb..a3727ba2f1 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -551,6 +551,37 @@ struct PlannerInfo
 	bool		partColsUpdated;
 };
 
+/*----------
+ * PlannerContext
+ *		Per-query context to provide additional information to
+ *		subquery_planner() and grouping_planner().
+ */
+typedef struct PlannerContext
+{
+	/*
+	 * hasRecursion: must be given as true if planning recursive WITH query.
+	 */
+	bool hasRecursion;
+
+	/*--------------------
+	 * tuple_fraction is the fraction of tuples we expect will be retrieved.
+	 * tuple_fraction is interpreted as follows:
+	 *	  0: expect all tuples to be retrieved (normal case)
+	 *	  0 < tuple_fraction < 1: expect the given fraction of tuples available
+	 *		from the plan to be retrieved
+	 *	  tuple_fraction >= 1: tuple_fraction is the absolute number of tuples
+	 *		expected to be retrieved (ie, a LIMIT specification)
+	 *--------------------
+	 */
+	double tuple_fraction;
+
+	/*
+	 * Can be set for set operation child queries to provide information on
+	 * the parent-level operation to allow them to optimize for this context
+	 */
+	SetOperationStmt *setops;
+
+} PlannerContext;
 
 /*
  * In places where it's known that simple_rte_array[] must have been prepared
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
index e1d79ffdf3..e3c9a4e3ae 100644
--- a/src/include/optimizer/planner.h
+++ b/src/include/optimizer/planner.h
@@ -44,7 +44,7 @@ extern PlannedStmt *standard_planner(Query *parse, const char *query_string,
 
 extern PlannerInfo *subquery_planner(PlannerGlobal *glob, Query *parse,
 									 PlannerInfo *parent_root,
-									 bool hasRecursion, double tuple_fraction);
+									 PlannerContext *context);
 
 extern RowMarkType select_rowmark_type(RangeTblEntry *rte,
 									   LockClauseStrength strength);
#22Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#21)
Re: Properly pathify the union planner

On Wed, Mar 27, 2024 at 6:34 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 27 Mar 2024 at 22:47, David Rowley <dgrowleyml@gmail.com> wrote:

I did wonder when first working on this patch if subquery_planner()
should grow an extra parameter, or maybe consolidate some existing
ones by passing some struct that provides the planner with a bit more
context about the query. A few of the existing parameters are likely
candidates for being in such a struct. e.g. hasRecursion and
tuple_fraction. A SetOperationStmt could go in there too.

The attached is roughly what I had in mind. I've not taken the time
to see what comments need to be updated, so the attached aims only to
assist discussion.

I like this idea. And there may be future applications for having such
a struct if we want to pass down additional information to subquery
planning, such as ordering requirements from the outer query.

Thanks
Richard

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#22)
Re: Properly pathify the union planner

Richard Guo <guofenglinux@gmail.com> writes:

On Wed, Mar 27, 2024 at 6:34 PM David Rowley <dgrowleyml@gmail.com> wrote:

The attached is roughly what I had in mind. I've not taken the time
to see what comments need to be updated, so the attached aims only to
assist discussion.

I like this idea.

I haven't studied the underlying problem yet, so I'm not quite
buying into whether we need this struct at all ... but assuming
we do, I feel like "PlannerContext" is a pretty poor name.
There's basically nothing to distinguish it from "PlannerInfo",
not to mention that readers would likely assume it's a memory
context of some sort.

Perhaps "SubqueryContext" or the like would be better? It
still has the conflict with memory contexts though.

regards, tom lane

#24David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#23)
Re: Properly pathify the union planner

On Thu, 28 Mar 2024 at 15:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I haven't studied the underlying problem yet, so I'm not quite
buying into whether we need this struct at all ...

The problem is, when planning a UNION child query, we want to try and
produce some Paths that would suit the top-level UNION query so that a
Merge Append -> Unique can be used rather than an Append -> Sort ->
Unique or Append -> Hash Aggregate.

The problem is informing the UNION child query about what it is. I
thought I could do root->parent_root->parse->setOperations for a UNION
child to know what it is, but that breaks for a query such as:

WITH q1(x) AS (SELECT 1)
SELECT FROM q1 UNION SELECT FROM q1

as the CTE also has root->parent_root->parse->setOperations set and in
the above case, that's a problem as there's some code that tries to
match the non-resjunk child targetlists up with the SetOperationStmt's
SortGroupClauses, but there's a mismatch for the CTE. The actual
UNION children should have a 1:1 match for non-junk columns.

but assuming
we do, I feel like "PlannerContext" is a pretty poor name.
There's basically nothing to distinguish it from "PlannerInfo",
not to mention that readers would likely assume it's a memory
context of some sort.

Perhaps "SubqueryContext" or the like would be better? It
still has the conflict with memory contexts though.

Maybe something with "Parameters" in the name?

David

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#24)
Re: Properly pathify the union planner

David Rowley <dgrowleyml@gmail.com> writes:

The problem is informing the UNION child query about what it is. I
thought I could do root->parent_root->parse->setOperations for a UNION
child to know what it is, but that breaks for a query such as:

Yeah, having grouping_planner poke into the parent level
doesn't seem like a great idea here. I continue to not like
the name "PlannerContext" but I agree passing down the setop
explicitly is the way to go.

Perhaps "SubqueryContext" or the like would be better? It
still has the conflict with memory contexts though.

Maybe something with "Parameters" in the name?

SubqueryParameters might be OK. Or SubqueryPlannerExtra?
Since this is a bespoke struct that will probably only ever
be used with subquery_planner, naming it after that function
seems like a good idea. (And, given that fact and the fact
that it's not a Node, I'm not sure it belongs in pathnodes.h.
We could just declare it in planner.h.)

Some minor comments now that I've looked at 66c0185a3 a little:

* Near the head of grouping_planner is this bit:

if (parse->setOperations)
{
    /*
     * If there's a top-level ORDER BY, assume we have to fetch all the
     * tuples. This might be too simplistic given all the hackery below
     * to possibly avoid the sort; but the odds of accurate estimates here
     * are pretty low anyway. XXX try to get rid of this in favor of
     * letting plan_set_operations generate both fast-start and
     * cheapest-total paths.
     */
    if (parse->sortClause)
        root->tuple_fraction = 0.0;

I'm pretty sure this comment is mine, but it's old enough that I don't
recall exactly what I had in mind. Still, it seems like your patch
has addressed precisely the issue of generating fast-start plans for
setops. Should we now remove this reset of tuple_fraction?

* generate_setop_child_grouplist does this:

/* assign a tleSortGroupRef, or reuse the existing one */
sgc->tleSortGroupRef = assignSortGroupRef(tle, targetlist);
tle->ressortgroupref = sgc->tleSortGroupRef;

That last line is redundant and confusing. It is not this code's
charter to change ressortgroupref.

regards, tom lane

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#25)
Re: Properly pathify the union planner

I wrote:

David Rowley <dgrowleyml@gmail.com> writes:

Maybe something with "Parameters" in the name?

SubqueryParameters might be OK. Or SubqueryPlannerExtra?
Since this is a bespoke struct that will probably only ever
be used with subquery_planner, naming it after that function
seems like a good idea.

On third thought, I'm not at all convinced that we even want to
invent this struct as compared to just adding another parameter
to subquery_planner. The problem with a struct is what happens
the next time we need to add a parameter? If we add yet another
function parameter, we can count on the compiler to complain
about call sites that didn't get the memo. Adding a field
within an existing struct provokes no such warning, leading
to situations with uninitialized fields that might accidentally
work during testing, but fail the minute they get to the field.

If you do want to go this direction, a minimum safety requirement
would be to have an ironclad rule that callers memset the whole
struct to zero before filling it, so that any not-set fields
will at least have predictable values. But I don't see the
point really.
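
To make that rule concrete, here is a minimal caller-side sketch under
the struct approach, using the PlannerContext layout from the patch
upthread (the struct name and exact fields are still under discussion,
so treat this as illustrative only):

    PlannerContext context;

    /* Zero the whole struct first so any field the caller forgets to
     * set still has a predictable value rather than stack garbage. */
    memset(&context, 0, sizeof(context));

    context.tuple_fraction = root->tuple_fraction;
    context.setops = castNode(SetOperationStmt, root->parse->setOperations);

    subroot = subquery_planner(root->glob, subquery, root, &context);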

regards, tom lane

#27David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#26)
Re: Properly pathify the union planner

On Fri, 29 Mar 2024 at 08:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

On third thought, I'm not at all convinced that we even want to
invent this struct as compared to just adding another parameter
to subquery_planner. The problem with a struct is what happens
the next time we need to add a parameter? If we add yet another
function parameter, we can count on the compiler to complain
about call sites that didn't get the memo. Adding a field
within an existing struct provokes no such warning, leading
to situations with uninitialized fields that might accidentally
work during testing, but fail the minute they get to the field.

I agree it's best to break callers that don't update their code to
consider passing or not passing a SetOperationStmt. I've just
committed a fix to do it that way. This also seems to be the path of
least resistance, which also appeals.
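
For reference, a rough sketch of what a set-operation child call site
looks like with the extra parameter (the exact committed signature may
differ in detail; callers that are not setop children simply pass NULL
for the last argument):

    /* set-operation child: tell the subquery about the parent setop */
    subroot = subquery_planner(root->glob, subquery, root,
                               false,               /* hasRecursion */
                               root->tuple_fraction,
                               castNode(SetOperationStmt,
                                        root->parse->setOperations));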

I opted to add a new test alongside the existing tests validating
that set operations with an empty SELECT list work. The new tests include
the variation that the set operation has both a materialized and
non-materialized CTE as a child. This was only a problem with a
materialized CTE, but I opted to include a non-materialized one as I
don't expect that we'll get this exact problem again. I was just keen
on getting more coverage with a couple of cheap tests.
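
For concreteness, the sort of queries involved looks like this (a sketch
based on the failing query from #17; the committed tests may phrase it
differently):

    -- materialized CTE as a set-operation child (the case that crashed)
    WITH q1(x) AS MATERIALIZED (SELECT 1)
    SELECT FROM q1 UNION SELECT FROM q1;

    -- non-materialized variant, included for extra coverage
    WITH q1(x) AS NOT MATERIALIZED (SELECT 1)
    SELECT FROM q1 UNION SELECT FROM q1;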

Thanks for your input on this. I'll review your other comments shortly.

David