parallel append vs. simple UNION ALL

Started by Robert Haas · about 8 years ago · 18 messages
#1 Robert Haas
robertmhaas@gmail.com
1 attachment(s)

As I mentioned in the commit message for the Parallel Append commit
(ab72716778128fb63d54ac256adf7fe6820a1185), it's kind of sad that this
doesn't work with UNION ALL queries, which are an obvious candidate
for such parallelization. It turns out that it actually does work to
a limited degree: assuming that the UNION ALL query can be converted
to a simple appendrel, it can consider a parallel append of
non-partial paths only. The attached patch lets it consider a
parallel append of partial paths by doing the following things:

1. Teaching set_subquery_pathlist to create *partial* SubqueryScan
paths as well as non-partial ones.
2. Teaching grouping_planner to create partial paths for the final rel
if not at the outermost query level.
3. Modifying finalize_plan to allow the gather_param to be passed
across subquery boundaries.

#3 is the only part I'm really unsure about; the other stuff looks
pretty cut and dried.

I have a draft patch that handles the case where the union can't be
converted to a simple appendrel, too, but that's not quite baked
enough to post yet.

For those for whom the above may be too technical to follow, here's an example:

pgbench -i 40
explain (costs off) select a.bid from pgbench_accounts a,
pgbench_branches b where a.bid = b.bid and aid % 1000 = 0 union all
select a.bid from pgbench_accounts a where aid % 1000 = 0;

Unpatched:

 Append
   ->  Gather
         Workers Planned: 2
         ->  Hash Join
               Hash Cond: (a.bid = b.bid)
               ->  Parallel Seq Scan on pgbench_accounts a
                     Filter: ((aid % 1000) = 0)
               ->  Hash
                     ->  Seq Scan on pgbench_branches b
   ->  Gather
         Workers Planned: 2
         ->  Parallel Seq Scan on pgbench_accounts a_1
               Filter: ((aid % 1000) = 0)

Patched:

 Gather
   Workers Planned: 2
   ->  Parallel Append
         ->  Hash Join
               Hash Cond: (a.bid = b.bid)
               ->  Parallel Seq Scan on pgbench_accounts a
                     Filter: ((aid % 1000) = 0)
               ->  Hash
                     ->  Seq Scan on pgbench_branches b
         ->  Parallel Seq Scan on pgbench_accounts a_1
               Filter: ((aid % 1000) = 0)

In this particular case the change doesn't buy very much, but the
second plan is better because it avoids shutting down one set of workers
and starting a new set. That's more efficient, plus it allows the two
branches to be worked on in parallel rather than serially. On a small
enough scale factor, even without the patch, you get this...

 Gather
   Workers Planned: 2
   ->  Parallel Append
         ->  Nested Loop
               Join Filter: (a.bid = b.bid)
               ->  Seq Scan on pgbench_branches b
               ->  Seq Scan on pgbench_accounts a
                     Filter: ((aid % 1000) = 0)
         ->  Seq Scan on pgbench_accounts a_1
               Filter: ((aid % 1000) = 0)

...but that's not good because now we have regular sequential scans
instead of partial sequential scans.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

subquery-smarts.patch
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 0e8463e4a3..3181a39042 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2177,6 +2177,24 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 				 create_subqueryscan_path(root, rel, subpath,
 										  pathkeys, required_outer));
 	}
+
+	/* Same for partial paths. */
+	foreach(lc, sub_final_rel->partial_pathlist)
+	{
+		Path	   *subpath = (Path *) lfirst(lc);
+		List	   *pathkeys;
+
+		/* Convert subpath's pathkeys to outer representation */
+		pathkeys = convert_subquery_pathkeys(root,
+											 rel,
+											 subpath->pathkeys,
+											 make_tlist_from_pathtarget(subpath->pathtarget));
+
+		/* Generate outer path using this subpath */
+		add_partial_path(rel, (Path *)
+				 create_subqueryscan_path(root, rel, subpath,
+										  pathkeys, required_outer));
+	}
 }
 
 /*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 382791fadb..bacb46ab24 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -2137,6 +2137,21 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		add_path(final_rel, path);
 	}
 
+	/*
+	 * Generate partial paths for final_rel, too, if outer query levels
+	 * might be able to make use of them.
+	 */
+	if (final_rel->consider_parallel && root->query_level > 1 &&
+		!limit_needed(parse))
+	{
+		Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
+		foreach(lc, current_rel->partial_pathlist)
+		{
+			Path	   *partial_path = (Path *) lfirst(lc);
+			add_partial_path(final_rel, partial_path);
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding ForeignPaths.
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 2e3abeea3d..2e213e75f1 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2407,10 +2407,16 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 			{
 				SubqueryScan *sscan = (SubqueryScan *) plan;
 				RelOptInfo *rel;
+				Bitmapset  *subquery_params;
 
-				/* We must run SS_finalize_plan on the subquery */
+				/* We must run finalize_plan on the subquery */
 				rel = find_base_rel(root, sscan->scan.scanrelid);
-				SS_finalize_plan(rel->subroot, sscan->subplan);
+				subquery_params = rel->subroot->outer_params;
+				if (gather_param >= 0)
+					subquery_params = bms_add_member(bms_copy(subquery_params),
+													 gather_param);
+				finalize_plan(rel->subroot, sscan->subplan, gather_param,
+							  subquery_params, NULL);
 
 				/* Now we can add its extParams to the parent's params */
 				context.paramids = bms_add_members(context.paramids,
#2 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#1)
4 attachment(s)
Re: parallel append vs. simple UNION ALL

On Sat, Dec 23, 2017 at 4:53 PM, Robert Haas <robertmhaas@gmail.com> wrote:

As I mentioned in the commit message for the Parallel Append commit
(ab72716778128fb63d54ac256adf7fe6820a1185), it's kind of sad that this
doesn't work with UNION ALL queries, which are an obvious candidate
for such parallelization. It turns out that it actually does work to
a limited degree: assuming that the UNION ALL query can be converted
to a simple appendrel, it can consider a parallel append of
non-partial paths only. The attached patch lets it consider a
parallel append of partial paths ...

Here's an extended series of patches that now handles both the simple
UNION ALL case (where we flatten it) and the unflattened case:

0001 is pretty much the same as the subquery-smarts.patch file I
attached to the previous email. I don't see much reason not to go
ahead and commit this, although it could use a test case. It makes
the simple/flattened case work. After some study I think that the
gather-parameter handling is correct, although if somebody felt like
reviewing that portion especially I wouldn't say no.

0002 rewrites recurse_union_children to work iteratively rather than
recursively and renames it to plan_union_children. This probably
isn't 100% necessary, but it seems to me that the resulting code is
easier to understand, and it reduces the risk of blowing out the
stack. There should be no user-visible behavior change.

0003 rewrites the setop planner to create a separate upper rel for
each stage of setop planning and uses them to return paths instead of
returning paths directly. This is necessary preparatory work for
anything that wants to consider multiple possible paths for queries
that go through the full setop planner, but it shouldn't have any
visible impact all by itself.

0004 causes generate_union_path() to consider both the traditional
method and also Gather -> Parallel Append -> [partial path for each
subquery]. This is still a bit rough around the edges and there's a
lot more that could be done here, but I'm posting what I have for now
in the (perhaps vain) hope of getting some feedback. With this, you
can use Parallel Append for the UNION ALL step of a query like SELECT
.. UNION ALL .. SELECT ... EXCEPT SELECT ...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0004-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patch
From d5689922919f425eed266c867cfdf44f9aff8a24 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Sat, 23 Dec 2017 10:56:07 -0800
Subject: [PATCH 4/4] Consider Parallel Append as a way to implement a union
 operation in a setop tree.

---
 src/backend/optimizer/prep/prepunion.c | 93 +++++++++++++++++++++++++++++++++-
 1 file changed, 91 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 6e5524046f..2f259ab0fe 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -298,12 +298,18 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		 */
 		set_subquery_size_estimates(root, rel);
 
+		/*
+		 * Since we may want to add a partial path to this relation, we must
+		 * set its consider_parallel flag correctly.
+		 */
+		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+		rel->consider_parallel = final_rel->consider_parallel;
+
 		/*
 		 * For the moment, we consider only a single Path for the subquery.
 		 * This should change soon (make it look more like
 		 * set_subquery_pathlist).
 		 */
-		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
 		subpath = get_cheapest_fractional_path(final_rel,
 											   root->tuple_fraction);
 
@@ -320,6 +326,23 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		add_path(rel, path);
 
+		/*
+		 * If we have a partial path for the child relation, we can use that
+		 * to build a partial path for this relation.  But there's no point in
+		 * considering any path but the cheapest.
+		 */
+		if (final_rel->partial_pathlist != NIL)
+		{
+			Path	   *partial_subpath;
+			Path	   *partial_path;
+
+			partial_subpath = linitial(final_rel->partial_pathlist);
+			partial_path = (Path *)
+				create_subqueryscan_path(root, rel, partial_subpath,
+										 NIL, NULL);
+			add_partial_path(rel, partial_path);
+		}
+
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
 		 * grouping or aggregation, its output is probably mostly unique
@@ -552,6 +575,9 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	double		save_fraction = root->tuple_fraction;
 	ListCell   *lc;
 	List	   *pathlist = NIL;
+	List	   *partial_pathlist = NIL;
+	bool		partial_paths_valid = true;
+	bool		consider_parallel = true;
 	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
@@ -591,18 +617,34 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
-	/* Build path list and relid set. */
+	/* Build path lists and relid set. */
 	foreach(lc, rellist)
 	{
 		RelOptInfo *rel = lfirst(lc);
 
 		pathlist = lappend(pathlist, rel->cheapest_total_path);
+
+		if (consider_parallel)
+		{
+			if (!rel->consider_parallel)
+			{
+				consider_parallel = false;
+				partial_paths_valid = false;
+			}
+			else if (rel->partial_pathlist == NIL)
+				partial_paths_valid = false;
+			else
+				partial_pathlist = lappend(partial_pathlist,
+										   linitial(rel->partial_pathlist));
+		}
+
 		relids = bms_union(relids, rel->relids);
 	}
 
 	/* Build result relation. */
 	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
 	result_rel->reltarget = create_pathtarget(root, tlist);
+	result_rel->consider_parallel = consider_parallel;
 
 	/*
 	 * Append the child results together.
@@ -626,6 +668,53 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	result_rel->rows = path->rows;
 
+	/*
+	 * Now consider doing the same thing using the partial paths plus Append
+	 * plus Gather.
+	 */
+	if (partial_paths_valid)
+	{
+		Path	   *ppath;
+		ListCell   *lc;
+		int			parallel_workers = 0;
+
+		/* Find the highest number of workers requested for any subpath. */
+		foreach(lc, partial_pathlist)
+		{
+			Path	   *path = lfirst(lc);
+
+			parallel_workers = Max(parallel_workers, path->parallel_workers);
+		}
+		Assert(parallel_workers > 0);
+
+		/*
+		 * If the use of parallel append is permitted, always request at least
+		 * log2(# of children) paths.  We assume it can be useful to have
+		 * extra workers in this case because they will be spread out across
+		 * the children.  The precise formula is just a guess; see
+		 * add_paths_to_append_rel.
+		 */
+		if (enable_parallel_append)
+		{
+			parallel_workers = Max(parallel_workers,
+								   fls(list_length(partial_pathlist)));
+			parallel_workers = Min(parallel_workers,
+								   max_parallel_workers_per_gather);
+		}
+		Assert(parallel_workers > 0);
+
+		ppath = (Path *)
+			create_append_path(result_rel, NIL, partial_pathlist,
+							   NULL, parallel_workers, enable_parallel_append,
+							   NIL /* XXX? Is this right? */ , -1);
+		ppath = (Path *)
+			create_gather_path(root, result_rel, ppath,
+							   result_rel->reltarget, NULL, NULL);
+		if (!op->all)
+			ppath = make_union_unique(op, ppath, tlist, root);
+		add_path(result_rel, ppath);
+	}
+
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
-- 
2.14.3 (Apple Git-98)

0003-Generate-a-separate-upper-relation-for-each-stage-of.patch
From e9edd8bcd486eedf7d3b4819849c3a9f09487172 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 15:42:51 -0500
Subject: [PATCH 3/4] Generate a separate upper relation for each stage of
 setop planning.

Commit 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f made setop planning
stages return paths rather than plans, but all such paths were loosely
associated with a single RelOptInfo, and only the final path was added
to the RelOptInfo.  Even at the time, it was foreseen that this should
be changed, because there is otherwise no good way for a single stage
of setop planning to return multiple paths.  With this patch, each
stage of set operation planning now creates a separate RelOptInfo;
these are distinguished by using appropriate relid sets.  Note that
this patch does nothing whatsoever about actually returning multiple
paths for the same set operation; it just makes it possible for a
future patch to do so.

Along the way, adjust things so that create_upper_paths_hook is called
for each of these new RelOptInfos rather than just once, since that
might be useful to extensions using that hook.  It might be a good
to provide an FDW API here as well, but I didn't try to do that for
now.

Patch by me.
---
 src/backend/optimizer/prep/prepunion.c | 330 ++++++++++++++++++---------------
 1 file changed, 185 insertions(+), 145 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index f387387289..6e5524046f 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -60,30 +60,29 @@ typedef struct
 	AppendRelInfo **appinfos;
 } adjust_appendrel_attrs_context;
 
-static Path *recurse_set_operations(Node *setOp, PlannerInfo *root,
+static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
 					   int flag, List *refnames_tlist,
 					   List **pTargetList,
 					   double *pNumGroups);
-static Path *generate_recursion_path(SetOperationStmt *setOp,
+static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
 						PlannerInfo *root,
 						List *refnames_tlist,
 						List **pTargetList);
-static Path *generate_union_path(SetOperationStmt *op, PlannerInfo *root,
+static RelOptInfo *generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 					List *refnames_tlist,
-					List **pTargetList,
-					double *pNumGroups);
-static Path *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
+					List **pTargetList);
+static RelOptInfo *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 					   List *refnames_tlist,
-					   List **pTargetList,
-					   double *pNumGroups);
+					   List **pTargetList);
 static List *plan_union_children(PlannerInfo *root,
 					SetOperationStmt *top_union,
 					List *refnames_tlist,
 					List **tlist_list);
 static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 				  PlannerInfo *root);
+static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 					Path *input_path,
 					double dNumGroups, double dNumOutputRows,
@@ -149,7 +148,6 @@ plan_set_operations(PlannerInfo *root)
 	RangeTblEntry *leftmostRTE;
 	Query	   *leftmostQuery;
 	RelOptInfo *setop_rel;
-	Path	   *path;
 	List	   *top_tlist;
 
 	Assert(topop);
@@ -181,57 +179,34 @@ plan_set_operations(PlannerInfo *root)
 	leftmostQuery = leftmostRTE->subquery;
 	Assert(leftmostQuery != NULL);
 
-	/*
-	 * We return our results in the (SETOP, NULL) upperrel.  For the moment,
-	 * this is also the parent rel of all Paths in the setop tree; we may well
-	 * change that in future.
-	 */
-	setop_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-
-	/*
-	 * We don't currently worry about setting setop_rel's consider_parallel
-	 * flag, nor about allowing FDWs to contribute paths to it.
-	 */
-
 	/*
 	 * If the topmost node is a recursive union, it needs special processing.
 	 */
 	if (root->hasRecursion)
 	{
-		path = generate_recursion_path(topop, root,
-									   leftmostQuery->targetList,
-									   &top_tlist);
+		setop_rel = generate_recursion_path(topop, root,
+											leftmostQuery->targetList,
+											&top_tlist);
 	}
 	else
 	{
 		/*
 		 * Recurse on setOperations tree to generate paths for set ops. The
-		 * final output path should have just the column types shown as the
+		 * final output paths should have just the column types shown as the
 		 * output from the top-level node, plus possibly resjunk working
 		 * columns (we can rely on upper-level nodes to deal with that).
 		 */
-		path = recurse_set_operations((Node *) topop, root,
-									  topop->colTypes, topop->colCollations,
-									  true, -1,
-									  leftmostQuery->targetList,
-									  &top_tlist,
-									  NULL);
+		setop_rel = recurse_set_operations((Node *) topop, root,
+										   topop->colTypes, topop->colCollations,
+										   true, -1,
+										   leftmostQuery->targetList,
+										   &top_tlist,
+										   NULL);
 	}
 
 	/* Must return the built tlist into root->processed_tlist. */
 	root->processed_tlist = top_tlist;
 
-	/* Add only the final path to the SETOP upperrel. */
-	add_path(setop_rel, path);
-
-	/* Let extensions possibly add some more paths */
-	if (create_upper_paths_hook)
-		(*create_upper_paths_hook) (root, UPPERREL_SETOP,
-									NULL, setop_rel);
-
-	/* Select cheapest path */
-	set_cheapest(setop_rel);
-
 	return setop_rel;
 }
 
@@ -245,21 +220,21 @@ plan_set_operations(PlannerInfo *root)
  * flag: if >= 0, add a resjunk output column indicating value of flag
  * refnames_tlist: targetlist to take column names from
  *
- * Returns a path for the subtree, as well as these output parameters:
+ * Returns a RelOptInfo for the subtree, as well as these output parameters:
  * *pTargetList: receives the fully-fledged tlist for the subtree's top plan
  * *pNumGroups: if not NULL, we estimate the number of distinct groups
  *		in the result, and store it there
  *
  * The pTargetList output parameter is mostly redundant with the pathtarget
- * of the returned path, but for the moment we need it because much of the
- * logic in this file depends on flag columns being marked resjunk.  Pending
- * a redesign of how that works, this is the easy way out.
+ * of the returned RelOptInfo, but for the moment we need it because much of
+ * the logic in this file depends on flag columns being marked resjunk.
+ * Pending a redesign of how that works, this is the easy way out.
  *
  * We don't have to care about typmods here: the only allowed difference
  * between set-op input and output typmods is input is a specific typmod
  * and output is -1, and that does not require a coercion.
  */
-static Path *
+static RelOptInfo *
 recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
@@ -267,6 +242,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List **pTargetList,
 					   double *pNumGroups)
 {
+	RelOptInfo *rel = NULL;		/* keep compiler quiet */
+
 	/* Guard against stack overflow due to overly complex setop nests */
 	check_stack_depth();
 
@@ -275,7 +252,6 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		RangeTblRef *rtr = (RangeTblRef *) setOp;
 		RangeTblEntry *rte = root->simple_rte_array[rtr->rtindex];
 		Query	   *subquery = rte->subquery;
-		RelOptInfo *rel;
 		PlannerInfo *subroot;
 		RelOptInfo *final_rel;
 		Path	   *subpath;
@@ -284,11 +260,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		Assert(subquery != NULL);
 
-		/*
-		 * We need to build a RelOptInfo for each leaf subquery.  This isn't
-		 * used for much here, but it carries the subroot data structures
-		 * forward to setrefs.c processing.
-		 */
+		/* Build a RelOptInfo for this leaf subquery. */
 		rel = build_simple_rel(root, rtr->rtindex, NULL);
 
 		/* plan_params should not be in use in current query level */
@@ -307,6 +279,18 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		if (root->plan_params)
 			elog(ERROR, "unexpected outer reference in set operation subquery");
 
+		/* Figure out the appropriate target list for this subquery. */
+		tlist = generate_setop_tlist(colTypes, colCollations,
+									 flag,
+									 rtr->rtindex,
+									 true,
+									 subroot->processed_tlist,
+									 refnames_tlist);
+		rel->reltarget = create_pathtarget(root, tlist);
+
+		/* Return the fully-fledged tlist to caller, too */
+		*pTargetList = tlist;
+
 		/*
 		 * Mark rel with estimated output rows, width, etc.  Note that we have
 		 * to do this before generating outer-query paths, else
@@ -334,22 +318,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		path = (Path *) create_subqueryscan_path(root, rel, subpath,
 												 NIL, NULL);
 
-		/*
-		 * Figure out the appropriate target list, and update the
-		 * SubqueryScanPath with the PathTarget form of that.
-		 */
-		tlist = generate_setop_tlist(colTypes, colCollations,
-									 flag,
-									 rtr->rtindex,
-									 true,
-									 subroot->processed_tlist,
-									 refnames_tlist);
-
-		path = apply_projection_to_path(root, rel, path,
-										create_pathtarget(root, tlist));
-
-		/* Return the fully-fledged tlist to caller, too */
-		*pTargetList = tlist;
+		add_path(rel, path);
 
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
@@ -378,25 +347,22 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 												  subpath->rows,
 												  NULL);
 		}
-
-		return (Path *) path;
 	}
 	else if (IsA(setOp, SetOperationStmt))
 	{
 		SetOperationStmt *op = (SetOperationStmt *) setOp;
-		Path	   *path;
 
 		/* UNIONs are much different from INTERSECT/EXCEPT */
 		if (op->op == SETOP_UNION)
-			path = generate_union_path(op, root,
-									   refnames_tlist,
-									   pTargetList,
-									   pNumGroups);
+			rel = generate_union_path(op, root,
+									  refnames_tlist,
+									  pTargetList);
 		else
-			path = generate_nonunion_path(op, root,
-										  refnames_tlist,
-										  pTargetList,
-										  pNumGroups);
+			rel = generate_nonunion_path(op, root,
+										 refnames_tlist,
+										 pTargetList);
+		if (pNumGroups)
+			*pNumGroups = rel->rows;
 
 		/*
 		 * If necessary, add a Result node to project the caller-requested
@@ -415,39 +381,70 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			!tlist_same_datatypes(*pTargetList, colTypes, junkOK) ||
 			!tlist_same_collations(*pTargetList, colCollations, junkOK))
 		{
+			PathTarget *target;
+			ListCell   *lc;
+
 			*pTargetList = generate_setop_tlist(colTypes, colCollations,
 												flag,
 												0,
 												false,
 												*pTargetList,
 												refnames_tlist);
-			path = apply_projection_to_path(root,
-											path->parent,
-											path,
-											create_pathtarget(root,
-															  *pTargetList));
+			target = create_pathtarget(root, *pTargetList);
+
+			/* Apply projection to each path */
+			foreach(lc, rel->pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				Path	   *path;
+
+				Assert(subpath->param_info == NULL);
+				path = apply_projection_to_path(root, subpath->parent,
+												subpath, target);
+				/* If we had to add a Result, path is different from subpath */
+				if (path != subpath)
+					lfirst(lc) = path;
+			}
+
+			/* Apply projection to each partial path */
+			foreach(lc, rel->partial_pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				Path	   *path;
+
+				Assert(subpath->param_info == NULL);
+
+				/* avoid apply_projection_to_path, in case of multiple refs */
+				path = (Path *) create_projection_path(root, subpath->parent,
+													   subpath, target);
+				lfirst(lc) = path;
+			}
 		}
-		return path;
 	}
 	else
 	{
 		elog(ERROR, "unrecognized node type: %d",
 			 (int) nodeTag(setOp));
 		*pTargetList = NIL;
-		return NULL;			/* keep compiler quiet */
 	}
+
+	postprocess_setop_rel(root, rel);
+
+	return rel;
 }
 
 /*
- * Generate path for a recursive UNION node
+ * Generate paths for a recursive UNION node
  */
-static Path *
+static RelOptInfo *
 generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 						List *refnames_tlist,
 						List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	RelOptInfo *result_rel;
 	Path	   *path;
+	RelOptInfo *lrel,
+			   *rrel;
 	Path	   *lpath;
 	Path	   *rpath;
 	List	   *lpath_tlist;
@@ -466,20 +463,22 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	 * Unlike a regular UNION node, process the left and right inputs
 	 * separately without any intention of combining them into one Append.
 	 */
-	lpath = recurse_set_operations(setOp->larg, root,
-								   setOp->colTypes, setOp->colCollations,
-								   false, -1,
-								   refnames_tlist,
-								   &lpath_tlist,
-								   NULL);
+	lrel = recurse_set_operations(setOp->larg, root,
+								  setOp->colTypes, setOp->colCollations,
+								  false, -1,
+								  refnames_tlist,
+								  &lpath_tlist,
+								  NULL);
+	lpath = lrel->cheapest_total_path;
 	/* The right path will want to look at the left one ... */
 	root->non_recursive_path = lpath;
-	rpath = recurse_set_operations(setOp->rarg, root,
-								   setOp->colTypes, setOp->colCollations,
-								   false, -1,
-								   refnames_tlist,
-								   &rpath_tlist,
-								   NULL);
+	rrel = recurse_set_operations(setOp->rarg, root,
+								  setOp->colTypes, setOp->colCollations,
+								  false, -1,
+								  refnames_tlist,
+								  &rpath_tlist,
+								  NULL);
+	rpath = rrel->cheapest_total_path;
 	root->non_recursive_path = NULL;
 
 	/*
@@ -491,6 +490,11 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP,
+								 bms_union(lrel->relids, rrel->relids));
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * If UNION, identify the grouping operators
 	 */
@@ -525,26 +529,30 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 											   result_rel,
 											   lpath,
 											   rpath,
-											   create_pathtarget(root, tlist),
+											   result_rel->reltarget,
 											   groupList,
 											   root->wt_param_id,
 											   dNumGroups);
 
-	return path;
+	add_path(result_rel, path);
+	postprocess_setop_rel(root, result_rel);
+	return result_rel;
 }
 
 /*
- * Generate path for a UNION or UNION ALL node
+ * Generate paths for a UNION or UNION ALL node
  */
-static Path *
+static RelOptInfo *
 generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 					List *refnames_tlist,
-					List **pTargetList,
-					double *pNumGroups)
+					List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	Relids		relids = NULL;
+	RelOptInfo *result_rel;
 	double		save_fraction = root->tuple_fraction;
-	List	   *pathlist;
+	ListCell   *lc;
+	List	   *pathlist = NIL;
+	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
 	Path	   *path;
@@ -569,7 +577,7 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	pathlist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+	rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -583,13 +591,24 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build path list and relid set. */
+	foreach(lc, rellist)
+	{
+		RelOptInfo *rel = lfirst(lc);
+
+		pathlist = lappend(pathlist, rel->cheapest_total_path);
+		relids = bms_union(relids, rel->relids);
+	}
+
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * Append the child results together.
 	 */
 	path = (Path *) create_append_path(result_rel, pathlist, NIL,
 									   NULL, 0, false, NIL, -1);
-	/* We have to manually jam the right tlist into the path; ick */
-	path->pathtarget = create_pathtarget(root, tlist);
 
 	/*
 	 * For UNION ALL, we just need the Append path.  For UNION, need to add
@@ -598,30 +617,32 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	if (!op->all)
 		path = make_union_unique(op, path, tlist, root);
 
+	add_path(result_rel, path);
+
 	/*
-	 * Estimate number of groups if caller wants it.  For now we just assume
-	 * the output is unique --- this is certainly true for the UNION case, and
-	 * we want worst-case estimates anyway.
+	 * Estimate number of groups.  For now we just assume the output is unique
+	 * --- this is certainly true for the UNION case, and we want worst-case
+	 * estimates anyway.
 	 */
-	if (pNumGroups)
-		*pNumGroups = path->rows;
+	result_rel->rows = path->rows;
 
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
-	return path;
+	return result_rel;
 }
 
 /*
- * Generate path for an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL node
+ * Generate paths for an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL node
  */
-static Path *
+static RelOptInfo *
 generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 					   List *refnames_tlist,
-					   List **pTargetList,
-					   double *pNumGroups)
+					   List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	RelOptInfo *result_rel;
+	RelOptInfo *lrel,
+			   *rrel;
 	double		save_fraction = root->tuple_fraction;
 	Path	   *lpath,
 			   *rpath,
@@ -646,18 +667,20 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 	root->tuple_fraction = 0.0;
 
 	/* Recurse on children, ensuring their outputs are marked */
-	lpath = recurse_set_operations(op->larg, root,
-								   op->colTypes, op->colCollations,
-								   false, 0,
-								   refnames_tlist,
-								   &lpath_tlist,
-								   &dLeftGroups);
-	rpath = recurse_set_operations(op->rarg, root,
-								   op->colTypes, op->colCollations,
-								   false, 1,
-								   refnames_tlist,
-								   &rpath_tlist,
-								   &dRightGroups);
+	lrel = recurse_set_operations(op->larg, root,
+								  op->colTypes, op->colCollations,
+								  false, 0,
+								  refnames_tlist,
+								  &lpath_tlist,
+								  &dLeftGroups);
+	lpath = lrel->cheapest_total_path;
+	rrel = recurse_set_operations(op->rarg, root,
+								  op->colTypes, op->colCollations,
+								  false, 1,
+								  refnames_tlist,
+								  &rpath_tlist,
+								  &dRightGroups);
+	rpath = rrel->cheapest_total_path;
 
 	/* Undo effects of forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
@@ -695,15 +718,17 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP,
+								 bms_union(lrel->relids, rrel->relids));
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * Append the child results together.
 	 */
 	path = (Path *) create_append_path(result_rel, pathlist, NIL,
 									   NULL, 0, false, NIL, -1);
 
-	/* We have to manually jam the right tlist into the path; ick */
-	path->pathtarget = create_pathtarget(root, tlist);
-
 	/* Identify the grouping semantics */
 	groupList = generate_setop_grouplist(op, tlist);
 
@@ -769,10 +794,9 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 									  dNumGroups,
 									  dNumOutputRows);
 
-	if (pNumGroups)
-		*pNumGroups = dNumGroups;
-
-	return path;
+	result_rel->rows = path->rows;
+	add_path(result_rel, path);
+	return result_rel;
 }
 
 /*
@@ -897,8 +921,6 @@ make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 															   groupList,
 															   tlist),
 								 -1.0);
-		/* We have to manually jam the right tlist into the path; ick */
-		path->pathtarget = create_pathtarget(root, tlist);
 		path = (Path *) create_upper_unique_path(root,
 												 result_rel,
 												 path,
@@ -909,6 +931,24 @@ make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 	return path;
 }
 
+/*
+ * postprocess_setop_rel - perform steps required after adding paths
+ */
+static void
+postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	/*
+	 * We don't currently worry about allowing FDWs to contribute paths to
+	 * this relation, but give extensions a chance.
+	 */
+	if (create_upper_paths_hook)
+		(*create_upper_paths_hook) (root, UPPERREL_SETOP,
+									NULL, rel);
+
+	/* Select cheapest path */
+	set_cheapest(rel);
+}
+
 /*
  * choose_hashed_setop - should we use hashing for a set operation?
  */
-- 
2.14.3 (Apple Git-98)

0002-Rewrite-recurse_union_children-to-iterate-rather-tha.patch
From b62ca029a5fdb35f7e2060db51a84705200e841a Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 12:13:46 -0500
Subject: [PATCH 2/4] Rewrite recurse_union_children to iterate, rather than
 recurse.

Also, rename it to plan_union_children; the old name wasn't
very descriptive.  This results in a small net reduction in code,
seems at least to me to be easier to understand, and saves
space on the process stack.
---
 src/backend/optimizer/prep/prepunion.c | 100 ++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 53 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index b586f941a8..f387387289 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -78,10 +78,10 @@ static Path *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 					   List *refnames_tlist,
 					   List **pTargetList,
 					   double *pNumGroups);
-static List *recurse_union_children(Node *setOp, PlannerInfo *root,
-					   SetOperationStmt *top_union,
-					   List *refnames_tlist,
-					   List **tlist_list);
+static List *plan_union_children(PlannerInfo *root,
+					SetOperationStmt *top_union,
+					List *refnames_tlist,
+					List **tlist_list);
 static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 				  PlannerInfo *root);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
@@ -545,8 +545,6 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
 	double		save_fraction = root->tuple_fraction;
 	List	   *pathlist;
-	List	   *child_tlists1;
-	List	   *child_tlists2;
 	List	   *tlist_list;
 	List	   *tlist;
 	Path	   *path;
@@ -571,13 +569,7 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	pathlist = list_concat(recurse_union_children(op->larg, root,
-												  op, refnames_tlist,
-												  &child_tlists1),
-						   recurse_union_children(op->rarg, root,
-												  op, refnames_tlist,
-												  &child_tlists2));
-	tlist_list = list_concat(child_tlists1, child_tlists2);
+	pathlist = plan_union_children(root, op, refnames_tlist, &tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -797,56 +789,58 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
  * generate_union_path will force a fresh sort if the top level is a UNION.
  */
 static List *
-recurse_union_children(Node *setOp, PlannerInfo *root,
-					   SetOperationStmt *top_union,
-					   List *refnames_tlist,
-					   List **tlist_list)
+plan_union_children(PlannerInfo *root,
+					SetOperationStmt *top_union,
+					List *refnames_tlist,
+					List **tlist_list)
 {
-	List	   *result;
+	List	   *pending_rels = list_make1(top_union);
+	List	   *result = NIL;
 	List	   *child_tlist;
 
-	if (IsA(setOp, SetOperationStmt))
+	*tlist_list = NIL;
+
+	while (pending_rels != NIL)
 	{
-		SetOperationStmt *op = (SetOperationStmt *) setOp;
+		Node	   *setOp = linitial(pending_rels);
+
+		pending_rels = list_delete_first(pending_rels);
 
-		if (op->op == top_union->op &&
-			(op->all == top_union->all || op->all) &&
-			equal(op->colTypes, top_union->colTypes))
+		if (IsA(setOp, SetOperationStmt))
 		{
-			/* Same UNION, so fold children into parent's subpath list */
-			List	   *child_tlists1;
-			List	   *child_tlists2;
+			SetOperationStmt *op = (SetOperationStmt *) setOp;
 
-			result = list_concat(recurse_union_children(op->larg, root,
-														top_union,
-														refnames_tlist,
-														&child_tlists1),
-								 recurse_union_children(op->rarg, root,
-														top_union,
-														refnames_tlist,
-														&child_tlists2));
-			*tlist_list = list_concat(child_tlists1, child_tlists2);
-			return result;
+			if (op->op == top_union->op &&
+				(op->all == top_union->all || op->all) &&
+				equal(op->colTypes, top_union->colTypes))
+			{
+				/* Same UNION, so fold children into parent */
+				pending_rels = lcons(op->rarg, pending_rels);
+				pending_rels = lcons(op->larg, pending_rels);
+				continue;
+			}
 		}
+
+		/*
+		 * Not same, so plan this child separately.
+		 *
+		 * Note we disallow any resjunk columns in child results.  This is
+		 * necessary since the Append node that implements the union won't do
+		 * any projection, and upper levels will get confused if some of our
+		 * output tuples have junk and some don't.  This case only arises when
+		 * we have an EXCEPT or INTERSECT as child, else there won't be
+		 * resjunk anyway.
+		 */
+		result = lappend(result, recurse_set_operations(setOp, root,
+														top_union->colTypes,
+														top_union->colCollations,
+														false, -1,
+														refnames_tlist,
+														&child_tlist,
+														NULL));
+		*tlist_list = lappend(*tlist_list, child_tlist);
 	}
 
-	/*
-	 * Not same, so plan this child separately.
-	 *
-	 * Note we disallow any resjunk columns in child results.  This is
-	 * necessary since the Append node that implements the union won't do any
-	 * projection, and upper levels will get confused if some of our output
-	 * tuples have junk and some don't.  This case only arises when we have an
-	 * EXCEPT or INTERSECT as child, else there won't be resjunk anyway.
-	 */
-	result = list_make1(recurse_set_operations(setOp, root,
-											   top_union->colTypes,
-											   top_union->colCollations,
-											   false, -1,
-											   refnames_tlist,
-											   &child_tlist,
-											   NULL));
-	*tlist_list = list_make1(child_tlist);
 	return result;
 }
 
-- 
2.14.3 (Apple Git-98)

0001-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patch
From 3d2c58b65aeecbd6484a31066ccc8b8f2ede583b Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 11:53:07 -0500
Subject: [PATCH 1/4] Let Parallel Append over simple UNION ALL have partial
 subpaths.

A simple UNION ALL gets flattened into an appendrel of subquery
RTEs, but up until now it's been impossible for the appendrel to use
the partial paths for the subqueries, so we can implement the
appendrel as a Parallel Append but only one with non-partial paths
as children.

There are three separate obstacles to removing that limitation.
First, when planning a subquery, propagate any partial paths to the
final_rel so that they are potentially visible to outer query
levels.  Second, after planning a subquery, propagate any partial
paths for the final_rel to the subquery RTE in the outer query
level in the same way we do for non-partial paths.  Third, teach
finalize_plan() to account for the possibility that the fake
parameter we use for rescan signalling when the plan contains a
Gather (Merge) node may be propagated from an outer query level.

Patch by me.

Discussion: http://postgr.es/m/CA+Tgmoa6L9A1nNCk3aTDVZLZ4KkHDn1+tm7mFyFvP+uQPS7bAg@mail.gmail.com
---
 src/backend/optimizer/path/allpaths.c  | 22 ++++++++++++++++++++++
 src/backend/optimizer/plan/planner.c   | 16 ++++++++++++++++
 src/backend/optimizer/plan/subselect.c | 10 ++++++++--
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index f714247ebb..f15bddc575 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2179,6 +2179,28 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 				 create_subqueryscan_path(root, rel, subpath,
 										  pathkeys, required_outer));
 	}
+
+	/* If consider_parallel is false, there should be no partial paths. */
+	Assert(sub_final_rel->consider_parallel ||
+		   sub_final_rel->partial_pathlist == NIL);
+
+	/* Same for partial paths. */
+	foreach(lc, sub_final_rel->partial_pathlist)
+	{
+		Path	   *subpath = (Path *) lfirst(lc);
+		List	   *pathkeys;
+
+		/* Convert subpath's pathkeys to outer representation */
+		pathkeys = convert_subquery_pathkeys(root,
+											 rel,
+											 subpath->pathkeys,
+											 make_tlist_from_pathtarget(subpath->pathtarget));
+
+		/* Generate outer path using this subpath */
+		add_partial_path(rel, (Path *)
+						 create_subqueryscan_path(root, rel, subpath,
+												  pathkeys, required_outer));
+	}
 }
 
 /*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 3e8cd1447c..6a4c5d35a6 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -2161,6 +2161,22 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		add_path(final_rel, path);
 	}
 
+	/*
+	 * Generate partial paths for final_rel, too, if outer query levels might
+	 * be able to make use of them.
+	 */
+	if (final_rel->consider_parallel && root->query_level > 1 &&
+		!limit_needed(parse))
+	{
+		Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
+		foreach(lc, current_rel->partial_pathlist)
+		{
+			Path	   *partial_path = (Path *) lfirst(lc);
+
+			add_partial_path(final_rel, partial_path);
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding ForeignPaths.
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 46367cba63..680ffd2cf9 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2407,10 +2407,16 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 			{
 				SubqueryScan *sscan = (SubqueryScan *) plan;
 				RelOptInfo *rel;
+				Bitmapset  *subquery_params;
 
-				/* We must run SS_finalize_plan on the subquery */
+				/* We must run finalize_plan on the subquery */
 				rel = find_base_rel(root, sscan->scan.scanrelid);
-				SS_finalize_plan(rel->subroot, sscan->subplan);
+				subquery_params = rel->subroot->outer_params;
+				if (gather_param >= 0)
+					subquery_params = bms_add_member(bms_copy(subquery_params),
+													 gather_param);
+				finalize_plan(rel->subroot, sscan->subplan, gather_param,
+							  subquery_params, NULL);
 
 				/* Now we can add its extParams to the parent's params */
 				context.paramids = bms_add_members(context.paramids,
-- 
2.14.3 (Apple Git-98)

#3 Rajkumar Raghuwanshi
rajkumar.raghuwanshi@enterprisedb.com
In reply to: Robert Haas (#2)
Re: parallel append vs. simple UNION ALL

On Sat, Feb 24, 2018 at 2:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:

0001 is pretty much the same as the subquery-smarts.patch file I
attached to the previous email. I don't see much reason not to go
ahead and commit this, although it could use a test case. It makes
the simple/flattened case work. After some study I think that the
gather-parameter handling is correct, although if somebody felt like
reviewing that portion especially I wouldn't say no.

I have applied 0001 on pg-head, and while playing with regression tests, it
crashed with below test case.

postgres=# SET min_parallel_table_scan_size=0;
SET
postgres=# SELECT * FROM information_schema.foreign_data_wrapper_options
ORDER BY 1, 2, 3;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

--logfile
2018-02-26 22:06:07.331 IST [43508] LOG: database system is ready to
accept connections
TRAP: FailedAssertion("!(subpath->parallel_safe)", File: "pathnode.c",
Line: 1813)
2018-02-26 22:06:42.345 IST [43508] LOG: server process (PID 43519) was
terminated by signal 6: Aborted
2018-02-26 22:06:42.345 IST [43508] DETAIL: Failed process was running:
SELECT * FROM information_schema.foreign_data_wrapper_options ORDER BY 1,
2, 3;

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

#4 Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Rajkumar Raghuwanshi (#3)
Re: parallel append vs. simple UNION ALL

On Sat, Feb 24, 2018 at 2:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:

0001 is pretty much the same as the subquery-smarts.patch file I
attached to the previous email. I don't see much reason not to go
ahead and commit this, although it could use a test case. It makes
the simple/flattened case work. After some study I think that the
gather-parameter handling is correct, although if somebody felt like
reviewing that portion especially I wouldn't say no.

I had a look at the 0001 patch. Other than the issue raised by Rajkumar,
it looks good functionally.

Regarding the finalize_plan() changes, I see that in the patch the
Gather rescan param is now included in valid_params while calling
finalize_plan() for the SubqueryScan, which looks correct. But instead
of doing that just before the recursive finalize_plan() call, I think
it would look better to do it in the initial section of
finalize_plan(), where we already add initSetParam to valid_params; we
could add the Gather rescan param there as well. Something like this:

@@ -2314,6 +2314,10 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 	if (initSetParam)
 		valid_params = bms_union(valid_params, initSetParam);
 
+	/* Same applies for Gather rescan param */
+	if (gather_param >= 0)
+		valid_params = bms_add_member(valid_params, gather_param);

On 27 February 2018 at 16:51, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:

I have applied 0001 on pg-head, and while playing with regression tests, it
crashed with below test case.

postgres=# SET min_parallel_table_scan_size=0;
SET
postgres=# SELECT * FROM information_schema.foreign_data_wrapper_options
ORDER BY 1, 2, 3;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

This is happening because there is a ProjectionPath as one of the
subpaths, and that is not parallel safe.

 Sort
   Sort Key: ((current_database())::information_schema.sql_identifier), ((w.fdwname)::information_schema.sql_identifier), ((((pg_options_to_table(w.fdwoptions))).option_name)::information_schema.sql_identifier)
   ->  Result
         ->  ProjectSet
               ->  Hash Join
                     Hash Cond: (w.fdwowner = u.oid)
                     ->  Seq Scan on pg_foreign_data_wrapper w
                           Filter: (pg_has_role(fdwowner, 'USAGE'::text) OR has_foreign_data_wrapper_privilege(oid, 'USAGE'::text))
                     ->  Hash
                           ->  Seq Scan on pg_authid u

In grouping_planner() where partial paths are generated for final_rel,
we can skip non-parallel-safe paths.
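
A rough sketch of that idea, applied to the hunk 0001 adds to
grouping_planner() (illustrative only; whether this is the right place
to fix it is a separate question):

	if (final_rel->consider_parallel && root->query_level > 1 &&
		!limit_needed(parse))
	{
		Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
		foreach(lc, current_rel->partial_pathlist)
		{
			Path	   *partial_path = (Path *) lfirst(lc);

			/* Skip paths that are not parallel-safe. */
			if (!partial_path->parallel_safe)
				continue;

			add_partial_path(final_rel, partial_path);
		}
	}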

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

#5 Rajkumar Raghuwanshi
rajkumar.raghuwanshi@enterprisedb.com
In reply to: Robert Haas (#2)
Re: parallel append vs. simple UNION ALL

On Sat, Feb 24, 2018 at 2:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:

0004 causes generate_union_path() to consider both the traditional
method and also Gather -> Parallel Append -> [partial path for each
subquery]. This is still a bit rough around the edges and there's a
lot more that could be done here, but I'm posting what I have for now
in the (perhaps vain) hope of getting some feedback. With this, you
can use Parallel Append for the UNION ALL step of a query like SELECT
.. UNION ALL .. SELECT ... EXCEPT SELECT ...

Hi,

With all of the 0001, 0002, 0003 and 0004 patches applied on head, I am
getting a strange crash while trying to change a table name in a query
by pressing the TAB key.

The same test case works fine with only 0001 applied, and also on PG head.

Below are the steps to reproduce.

--run the SQL below

SET parallel_setup_cost=0;
SET parallel_tuple_cost=0;
SET min_parallel_table_scan_size=0;
CREATE TABLE tbl_union_t1 (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3 CHAR(10));
INSERT INTO tbl_union_t1 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM
generate_series(0, 499,2) i;
CREATE TABLE tbl_union_t2 (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3 CHAR(10));
INSERT INTO tbl_union_t2 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM
generate_series(0, 499,3) i;
ANALYSE tbl_union_t1;
ANALYSE tbl_union_t2;

EXPLAIN SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM tbl_union_t1 EXCEPT
SELECT c1,c2 FROM tbl_union_t2 WHERE c1 % 25 =0 )UA;

--now try modifying tbl_union_t1 in the above query
--remove "_union_t1" and press the TAB key; it crashed for me.

EXPLAIN SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM tbl<PRESS TAB KEY
HERE> EXCEPT SELECT c1,c2 FROM tbl_union_t2 WHERE c1 % 25 =0 )UA;

postgres=# EXPLAIN SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM
tblWARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.

--logfile says something like this
2018-03-01 18:37:36.456 IST [50071] LOG: database system is ready to
accept connections
2018-03-01 18:38:38.668 IST [50071] LOG: background worker "parallel
worker" (PID 51703) was terminated by signal 11: Segmentation fault
2018-03-01 18:38:38.668 IST [50071] DETAIL: Failed process was running:
SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class c WHERE
c.relkind IN ('r', 'S', 'v', 'm', 'f', 'p') AND
substring(pg_catalog.quote_ident(c.relname),1,3)='tbl' AND
pg_catalog.pg_table_is_visible(c.oid) AND c.relnamespace <> (SELECT oid
FROM pg_catalog.pg_namespace WHERE nspname = 'pg_catalog')
UNION
SELECT pg_catalog.quote_ident(n.nspname) || '.' FROM
pg_catalog.pg_namespace n WHERE substring(pg_catalog.quote_ident(n.nspname)
|| '.',1,3)='tbl' AND (SELECT pg_catalog.count(*) FROM
pg_catalog.pg_namespace WHERE substring(pg_catalog.quote_ident(nspname) ||
'.',1,3) =
substring('tbl',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)) > 1
UNION
SELECT pg_catalog.quote_ident(n.nspname) || '.' ||
pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class c,
pg_catalog.pg_namespace n WHERE c.relnamespace = n.oid AND c.relkind IN
('r', 'S', 'v', 'm', 'f', 'p') AND
substring(pg_catalog.quote_ident(n.nspname) || '.' ||
pg_catalog.quote_ident(c.relname),1,3)='tbl' AND
substring(pg_catalog.quote_id
2018-03-01 18:38:38.668 IST [50071] LOG: terminating any other active
server processes
2018-03-01 18:38:38.668 IST [50082] WARNING: terminating connection
because of crash of another server process
2018-03-01 18:38:38.668 IST [50082] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2018-03-01 18:38:38.668 IST [50082] HINT: In a moment you should be able
to reconnect to the database and repeat your command.
2018-03-01 18:38:38.670 IST [50076] WARNING: terminating connection
because of crash of another server process
2018-03-01 18:38:38.670 IST [50076] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2018-03-01 18:38:38.670 IST [50076] HINT: In a moment you should be able
to reconnect to the database and repeat your command.
2018-03-01 18:38:38.675 IST [50071] LOG: all server processes terminated;
reinitializing
2018-03-01 18:38:38.702 IST [51712] LOG: database system was interrupted;
last known up at 2018-03-01 18:37:36 IST
2018-03-01 18:38:38.723 IST [51712] LOG: database system was not properly
shut down; automatic recovery in progress
2018-03-01 18:38:38.724 IST [51712] LOG: redo starts at 0/1639510
2018-03-01 18:38:38.726 IST [51712] LOG: invalid record length at
0/1669488: wanted 24, got 0
2018-03-01 18:38:38.726 IST [51712] LOG: redo done at 0/1669420
2018-03-01 18:38:38.726 IST [51712] LOG: last completed transaction was at
log time 2018-03-01 18:38:36.53573+05:30
2018-03-01 18:38:38.744 IST [50071] LOG: database system is ready to
accept connections

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

#6 Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Robert Haas (#2)
Re: parallel append vs. simple UNION ALL

On Sat, Feb 24, 2018 at 2:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Here's an extended series of patches that now handles both the simple
UNION ALL case (where we flatten it) and the unflattened case:

The patches look clean. I particularly looked at 0003.

patch 0001
+    /*
+     * Generate partial paths for final_rel, too, if outer query levels might
+     * be able to make use of them.
+     */
I am not able to understand the construct, especially the "if" clause.
Did you want to say "... if there are outer query levels. Those might
..." or something like that?

0002
(op->all == top_union->all || op->all) &&
This isn't really your change. Checking
op->all is cheaper than checking equality, so maybe we should check that first
and take advantage of short-circuit evaluation. If we do that, the above
condition reduces to (op->all || !top_union->all), which is two boolean
conditions, even cheaper. But maybe the second optimization is not worth the
loss of readability.
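
For illustration, the condition rewritten that way would look something
like this (same truth table as the original, just relying on
short-circuit evaluation):

	if (op->op == top_union->op &&
		(op->all || !top_union->all) &&	/* was: op->all == top_union->all || op->all */
		equal(op->colTypes, top_union->colTypes))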

"identically-propertied UNIONs" may be "UNIONs with identical properties".

0003
Probably we want to rename generate_union_path() as generate_union_rel() or
generate_union_paths() since the function doesn't return a path anymore.
Similarly for generate_nonunion_path().

In recurse_set_operations()
- return NULL; /* keep compiler quiet */
This line is deleted and instead rel is initialized to NULL. That way we lose
any chance of catching, via a compiler warning, a future bug where some block
leaves rel uninitialized. Maybe we should instead replace "return NULL" with
"rel = NULL", which will never be executed because of the error.

+    /* Build path list and relid set. */
+    foreach(lc, rellist)
+    {
With the changes in this patch, we could actually use add_paths_to_append_rel()
to create the append path. That function builds paths with different pathkeys
and parameterization (which doesn't matter here) and also handles parallel
append, so we could avoid code duplication and leverage more optimizations,
such as using MergeAppend instead of an overall sort. But that function doesn't
have the ability to add a final node like make_union_unique(). A similar
requirement has arisen in partition-wise join, where we need to add a final
node to finalize the aggregate on top of paths created by
add_paths_to_append_rel(). Maybe we can change that function to return a list
of paths, which are then finalized by the caller and added to the "append"
rel. But I don't think doing all that is within the scope of this patch set.

0004
+        if (!op->all)
+            ppath = make_union_unique(op, ppath, tlist, root);
We could probably push the grouping/sorting down to the parallel workers. But
again, that's not part of this patch set.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#7Robert Haas
robertmhaas@gmail.com
In reply to: Rajkumar Raghuwanshi (#3)
Re: parallel append vs. simple UNION ALL

On Tue, Feb 27, 2018 at 6:21 AM, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:

I have applied 0001 on pg-head, and while playing with regression tests, it
crashed with below test case.

postgres=# SET min_parallel_table_scan_size=0;
SET
postgres=# SELECT * FROM information_schema.foreign_data_wrapper_options
ORDER BY 1, 2, 3;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Hmm, nice catch. I think part of the problem here is that commit
69f4b9c85f168ae006929eec44fc44d569e846b9, wherein Andres introduced
the ProjectSet node, didn't really do the right thing in terms of
testing parallel-safety. Before that commit, is_parallel_safe(root,
(Node *) scanjoin_target->exprs) was really testing whether the tlist
produced during the scan/join phase was parallel-safe. However, after
that commit, scanjoin_target->exprs wasn't really the final target for
the scan/join phase any more; instead, it was the first of possibly
several targets computed by split_pathtarget_at_srfs(). Really, the
right thing to do is to test the *last* element in that list for
parallel-safety, but as the code stands we end up testing the *first*
element. So, if there's a parallel-restricted item in the target list
(this query ends up putting a CoerceToDomain in the target list, which
we currently consider parallel-restricted), it looks like we can
nevertheless end up trying to project it in what is supposed to be a
partial path.
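
To illustrate with a toy model (plain C, not the planner code): when the
parallel-restricted expression lands only in the last of the targets produced
by the split, checking just the first target reports the chain as safe even
though the final projection is not:

#include <stdbool.h>
#include <stdio.h>

typedef struct TargetStub
{
    bool        parallel_safe;
    const char *desc;
} TargetStub;

int
main(void)
{
    /*
     * Stand-in for the list split_pathtarget_at_srfs() might produce; the
     * CoerceToDomain-style restricted item appears only in the final target.
     */
    TargetStub  targets[] = {
        {true, "plain column references"},
        {true, "set-returning function calls"},
        {false, "coercion to a domain type"}
    };
    int         ntargets = sizeof(targets) / sizeof(targets[0]);

    printf("checking first target: safe = %d (what the code does today)\n",
           (int) targets[0].parallel_safe);
    printf("checking last target:  safe = %d (what it should report)\n",
           (int) targets[ntargets - 1].parallel_safe);
    return 0;
}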

There are a large number of cases where this doesn't end up mattering
because the next upper_rel created will not get marked
consider_parallel because its target list will also contain the same
parallel-restricted construct, and therefore the partial paths
generated for the scan/join rel will never get used -- except to
generate Gather/Gather Merge paths, which has already happened; but
that step didn't know about the rest of the scan/join targets either,
so it won't have used them. However, both create_distinct_paths() and
the code in grouping_planner that populates final_rel assume that they
don't need to retest the target for parallel-safety because no
projection is done at those levels; they just inherit the
parallel-safety marking of the input rel, so in those cases, if the
input rel's marking is wrong, the wrong marking is propagated upward.

There's another way final_rel->consider_parallel can be wrong, too: if
the FROM-list is empty, then we create a join rel and set its
consider_parallel flag without regard to the parallel-safety of the
target list. There are comments in query_planner() saying that this
will be dealt with "later", but this seems not to be true. (Before
Tom's commit da1c91631e3577ea5818f855ebb5bd206d559006, the comments
simply ignored the question of whether a check was needed here, but
Tom seems to have inserted an incorrect justification for the
already-wrong code.)

I'm not sure to what degree, if at all, any of these problems are
visible given that we don't use final_rel->consider_parallel for much
of anything. Certainly, it gets much easier to trigger a problem with
0001 applied, as the test case shows. But I'm not entirely convinced
that there's no problem even without that. It seems like every upper
rel that is setting its consider_parallel flag based on the first
element of some list of targets rather than the last is potentially
vulnerable to ending up with the wrong answer, and I'm afraid that
might have some adverse consequence that I haven't quite pinned down
yet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Rajkumar Raghuwanshi
rajkumar.raghuwanshi@enterprisedb.com
In reply to: Robert Haas (#7)
Re: parallel append vs. simple UNION ALL

Hi,

With 0001 applied on PG-head, I got a reference leak warning and later a
server crash.
This crash is also reproducible with enable_parallel_append=off.
Below is the test case to reproduce this.

SET enable_parallel_append=off;
SET parallel_setup_cost=0;
SET parallel_tuple_cost=0;
SET min_parallel_table_scan_size=0;

CREATE TABLE foo (a numeric PRIMARY KEY);
INSERT INTO foo VALUES (1);
INSERT INTO foo VALUES (2);
ANALYZE foo;

CREATE TABLE bar (a numeric PRIMARY KEY);
INSERT INTO bar VALUES (6);
INSERT INTO bar VALUES (7);
ANALYZE bar;

Select a from foo where a=(select * from foo where a=1)
UNION All
SELECT a FROM bar where a=(select * from bar where a=1);

/*
postgres=# Select a from foo where a=(select * from foo where a=1)
UNION All
SELECT a FROM bar where a=(select * from bar where a=1);
WARNING: relcache reference leak: relation "foo" not closed
WARNING: relcache reference leak: relation "bar" not closed
WARNING: Snapshot reference leak: Snapshot 0x22f9168 still referenced
WARNING: Snapshot reference leak: Snapshot 0x22f9298 still referenced
a
---
1
(1 row)

postgres=# Select a from foo where a=(select * from foo where a=1)
UNION All
SELECT a FROM bar where a=(select * from bar where a=1);
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
*/

Stack trace for the crash is given below:

/*
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `postgres: parallel worker for PID
7574 '.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000712283 in ExecProcNode (node=0x0) at
../../../src/include/executor/executor.h:236
236 if (node->chgParam != NULL) /* something changed? */
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0 0x0000000000712283 in ExecProcNode (node=0x0) at
../../../src/include/executor/executor.h:236
#1 0x0000000000714304 in ExecSetParamPlan (node=0x2311dd0,
econtext=0x2312660) at nodeSubplan.c:1056
#2 0x00000000006ce19f in ExecEvalParamExec (state=0x231ab10, op=0x231ac28,
econtext=0x2312660) at execExprInterp.c:2225
#3 0x00000000006cc106 in ExecInterpExpr (state=0x231ab10,
econtext=0x2312660, isnull=0x7ffddfed6e47 "") at execExprInterp.c:1024
#4 0x00000000006cd828 in ExecInterpExprStillValid (state=0x231ab10,
econtext=0x2312660, isNull=0x7ffddfed6e47 "") at execExprInterp.c:1819
#5 0x00000000006e02a1 in ExecEvalExprSwitchContext (state=0x231ab10,
econtext=0x2312660, isNull=0x7ffddfed6e47 "") at
../../../src/include/executor/executor.h:305
#6 0x00000000006e038e in ExecQual (state=0x231ab10, econtext=0x2312660) at
../../../src/include/executor/executor.h:374
#7 0x00000000006e066b in ExecScan (node=0x2311cb8, accessMtd=0x70e4dc
<SeqNext>, recheckMtd=0x70e5b3 <SeqRecheck>) at execScan.c:190
#8 0x000000000070e600 in ExecSeqScan (pstate=0x2311cb8) at
nodeSeqscan.c:129
#9 0x00000000006deac2 in ExecProcNodeFirst (node=0x2311cb8) at
execProcnode.c:446
#10 0x00000000006e9219 in ExecProcNode (node=0x2311cb8) at
../../../src/include/executor/executor.h:239
#11 0x00000000006e94a1 in ExecAppend (pstate=0x23117a8) at nodeAppend.c:203
#12 0x00000000006deac2 in ExecProcNodeFirst (node=0x23117a8) at
execProcnode.c:446
#13 0x00000000006d5469 in ExecProcNode (node=0x23117a8) at
../../../src/include/executor/executor.h:239
#14 0x00000000006d7dc2 in ExecutePlan (estate=0x2310e08,
planstate=0x23117a8, use_parallel_mode=0 '\000', operation=CMD_SELECT,
sendTuples=1 '\001', numberTuples=0,
direction=ForwardScanDirection, dest=0x22ea108, execute_once=1 '\001')
at execMain.c:1721
#15 0x00000000006d5a3b in standard_ExecutorRun (queryDesc=0x22ff1d8,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:361
#16 0x00000000006d5857 in ExecutorRun (queryDesc=0x22ff1d8,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:304
#17 0x00000000006dcb39 in ParallelQueryMain (seg=0x22561d0,
toc=0x7f49097c0000) at execParallel.c:1313
#18 0x0000000000535cdb in ParallelWorkerMain (main_arg=737000409) at
parallel.c:1397
#19 0x00000000007fcb8d in StartBackgroundWorker () at bgworker.c:841
#20 0x000000000080ffb1 in do_start_bgworker (rw=0x227cea0) at
postmaster.c:5741
#21 0x000000000081031d in maybe_start_bgworkers () at postmaster.c:5954
#22 0x000000000080f382 in sigusr1_handler (postgres_signal_arg=10) at
postmaster.c:5134
#23 <signal handler called>
#24 0x0000003dd26e1603 in __select_nocancel () at
../sysdeps/unix/syscall-template.S:82
#25 0x000000000080ab76 in ServerLoop () at postmaster.c:1721
#26 0x000000000080a365 in PostmasterMain (argc=3, argv=0x2254180) at
postmaster.c:1365
#27 0x000000000073f0b0 in main (argc=3, argv=0x2254180) at main.c:228
*/

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

On Tue, Mar 6, 2018 at 12:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Show quoted text

On Tue, Feb 27, 2018 at 6:21 AM, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:

I have applied 0001 on pg-head, and while playing with regression tests,

it

crashed with below test case.

postgres=# SET min_parallel_table_scan_size=0;
SET
postgres=# SELECT * FROM information_schema.foreign_data_wrapper_options
ORDER BY 1, 2, 3;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Hmm, nice catch. I think part of the problem here is that commit
69f4b9c85f168ae006929eec44fc44d569e846b9, wherein Andres introduced
the ProjectSet node, didn't really do the right thing in terms of
testing parallel-safety. Before that commit, is_parallel_safe(root,
(Node *) scanjoin_target->exprs)) was really testing whether the tlist
produced during the scan/join phase was parallel-safe. However, after
that commit, scanjoin_target->exprs wasn't really the final target for
the scan/join phase any more; instead, it was the first of possibly
several targets computed by split_pathtarget_at_srfs(). Really, the
right thing to do is to test the *last* element in that list for
parallel-safety, but as the code stands we end up testing the *first*
element. So, if there's a parallel-restricted item in the target list
(this query ends up putting a CoerceToDomain in the target list, which
we currently consider parallel-restricted), it looks we can
nevertheless end up trying to project it in what is supposed to be a
partial path.

There are a large number of cases where this doesn't end up mattering
because the next upper_rel created will not get marked
consider_parallel because its target list will also contain the same
parallel-restricted construct, and therefore the partial paths
generated for the scan/join rel will never get used -- except to
generate Gather/Gather Merge paths, which has already happened; but
that step didn't know about the rest of the scan/join targets either,
so it won't have used them. However, both create_distinct_paths() and
the code in grouping_planner that populates final_rel assume that they
don't need to retest the target for parallel-safety because no
projection is done at those levels; they just inherit the
parallel-safety marking of the input rel, so in those cases if the
input rel's marking is wrong the result is populated upward.

There's another way final_rel->consider_parallel can be wrong, too: if
the FROM-list is empty, then we create a join rel and set its
consider_parallel flag without regard to the parallel-safety of the
target list. There are comments in query_planner() says that this
will be dealt with "later", but this seems not to be true. (Before
Tom's commit da1c91631e3577ea5818f855ebb5bd206d559006, the comments
simply ignored the question of whether a check was needed here, but
Tom seems to have inserted an incorrect justification for the
already-wrong code.)

I'm not sure to what degree, if at all, any of these problems are
visible given that we don't use final_rel->consider_parallel for much
of anything. Certainly, it gets much easier to trigger a problem with
0001 applied, as the test case shows. But I'm not entirely convinced
that there's no problem even without that. It seems like every upper
rel that is setting its consider_parallel flag based on the first
element of some list of targets rather than the last is potentially
vulnerable to ending up with the wrong answer, and I'm afraid that
might have some adverse consequence that I haven't quite pinned down
yet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9Robert Haas
robertmhaas@gmail.com
In reply to: Rajkumar Raghuwanshi (#8)
5 attachment(s)
Re: parallel append vs. simple UNION ALL

On Wed, Mar 7, 2018 at 5:36 AM, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:

With 0001 applied on PG-head, I got a reference leak warning and later a
server crash.
This crash is also reproducible with enable_parallel_append=off.
Below is the test case to reproduce this.

New patches attached, fixing all 3 of the issues you reported:

0001 is a new patch to fix the incorrect parallel safety marks on
upper relations. I don't know of a visible effect of this patch by
itself, but there might be one.

0002 is the same as the old 0001, but I made a fix in
SS_charge_for_initplans() which fixed your most recent crash report.
Either this or the previous change also fixed the crash you saw when
using tab-completion. Also, I added some test cases based on your
failing examples.

0003-0005 are the same as the old 0002-0004.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0005-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patchapplication/octet-stream; name=0005-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patchDownload
From 8cde9c92e038659ada5f591ab7f9d28d6189e2f7 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Sat, 23 Dec 2017 10:56:07 -0800
Subject: [PATCH 5/5] Consider Parallel Append as a way to implement a union
 operation in a setop tree.

---
 src/backend/optimizer/prep/prepunion.c | 93 +++++++++++++++++++++++++++++++++-
 1 file changed, 91 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 6e5524046f..2f259ab0fe 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -298,12 +298,18 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		 */
 		set_subquery_size_estimates(root, rel);
 
+		/*
+		 * Since we may want to add a partial path to this relation, we must
+		 * set its consider_parallel flag correctly.
+		 */
+		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+		rel->consider_parallel = final_rel->consider_parallel;
+
 		/*
 		 * For the moment, we consider only a single Path for the subquery.
 		 * This should change soon (make it look more like
 		 * set_subquery_pathlist).
 		 */
-		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
 		subpath = get_cheapest_fractional_path(final_rel,
 											   root->tuple_fraction);
 
@@ -320,6 +326,23 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		add_path(rel, path);
 
+		/*
+		 * If we have a partial path for the child relation, we can use that
+		 * to build a partial path for this relation.  But there's no point in
+		 * considering any path but the cheapest.
+		 */
+		if (final_rel->partial_pathlist != NIL)
+		{
+			Path	   *partial_subpath;
+			Path	   *partial_path;
+
+			partial_subpath = linitial(final_rel->partial_pathlist);
+			partial_path = (Path *)
+				create_subqueryscan_path(root, rel, partial_subpath,
+										 NIL, NULL);
+			add_partial_path(rel, partial_path);
+		}
+
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
 		 * grouping or aggregation, its output is probably mostly unique
@@ -552,6 +575,9 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	double		save_fraction = root->tuple_fraction;
 	ListCell   *lc;
 	List	   *pathlist = NIL;
+	List	   *partial_pathlist = NIL;
+	bool		partial_paths_valid = true;
+	bool		consider_parallel = true;
 	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
@@ -591,18 +617,34 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
-	/* Build path list and relid set. */
+	/* Build path lists and relid set. */
 	foreach(lc, rellist)
 	{
 		RelOptInfo *rel = lfirst(lc);
 
 		pathlist = lappend(pathlist, rel->cheapest_total_path);
+
+		if (consider_parallel)
+		{
+			if (!rel->consider_parallel)
+			{
+				consider_parallel = false;
+				partial_paths_valid = false;
+			}
+			else if (rel->partial_pathlist == NIL)
+				partial_paths_valid = false;
+			else
+				partial_pathlist = lappend(partial_pathlist,
+										   linitial(rel->partial_pathlist));
+		}
+
 		relids = bms_union(relids, rel->relids);
 	}
 
 	/* Build result relation. */
 	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
 	result_rel->reltarget = create_pathtarget(root, tlist);
+	result_rel->consider_parallel = consider_parallel;
 
 	/*
 	 * Append the child results together.
@@ -626,6 +668,53 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	result_rel->rows = path->rows;
 
+	/*
+	 * Now consider doing the same thing using the partial paths plus Append
+	 * plus Gather.
+	 */
+	if (partial_paths_valid)
+	{
+		Path	   *ppath;
+		ListCell   *lc;
+		int			parallel_workers = 0;
+
+		/* Find the highest number of workers requested for any subpath. */
+		foreach(lc, partial_pathlist)
+		{
+			Path	   *path = lfirst(lc);
+
+			parallel_workers = Max(parallel_workers, path->parallel_workers);
+		}
+		Assert(parallel_workers > 0);
+
+		/*
+		 * If the use of parallel append is permitted, always request at least
+		 * log2(# of children) paths.  We assume it can be useful to have
+		 * extra workers in this case because they will be spread out across
+		 * the children.  The precise formula is just a guess; see
+		 * add_paths_to_append_rel.
+		 */
+		if (enable_parallel_append)
+		{
+			parallel_workers = Max(parallel_workers,
+								   fls(list_length(partial_pathlist)));
+			parallel_workers = Min(parallel_workers,
+								   max_parallel_workers_per_gather);
+		}
+		Assert(parallel_workers > 0);
+
+		ppath = (Path *)
+			create_append_path(result_rel, NIL, partial_pathlist,
+							   NULL, parallel_workers, enable_parallel_append,
+							   NIL /* XXX? Is this right? */ , -1);
+		ppath = (Path *)
+			create_gather_path(root, result_rel, ppath,
+							   result_rel->reltarget, NULL, NULL);
+		if (!op->all)
+			ppath = make_union_unique(op, ppath, tlist, root);
+		add_path(result_rel, ppath);
+	}
+
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
-- 
2.14.3 (Apple Git-98)

0004-Generate-a-separate-upper-relation-for-each-stage-of.patchapplication/octet-stream; name=0004-Generate-a-separate-upper-relation-for-each-stage-of.patchDownload
From 7b79170661a9255b6237e7d3fe01a33a75c9ecc3 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 15:42:51 -0500
Subject: [PATCH 4/5] Generate a separate upper relation for each stage of
 setop planning.

Commit 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f made setop planning
stages return paths rather than plans, but all such paths were loosely
associated with a single RelOptInfo, and only the final path was added
to the RelOptInfo.  Even at the time, it was foreseen that this should
be changed, because there is otherwise no good way for a single stage
of setop planning to return multiple paths.  With this patch, each
stage of set operation planning now creates a separate RelOptInfo;
these are distinguished by using appropriate relid sets.  Note that
this patch does nothing whatsoever about actually returning multiple
paths for the same set operation; it just makes it possible for a
future patch to do so.

Along the way, adjust things so that create_upper_paths_hook is called
for each of these new RelOptInfos rather than just once, since that
might be useful to extensions using that hook.  It might be a good
to provide an FDW API here as well, but I didn't try to do that for
now.

Patch by me.
---
 src/backend/optimizer/prep/prepunion.c | 330 ++++++++++++++++++---------------
 1 file changed, 185 insertions(+), 145 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index f387387289..6e5524046f 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -60,30 +60,29 @@ typedef struct
 	AppendRelInfo **appinfos;
 } adjust_appendrel_attrs_context;
 
-static Path *recurse_set_operations(Node *setOp, PlannerInfo *root,
+static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
 					   int flag, List *refnames_tlist,
 					   List **pTargetList,
 					   double *pNumGroups);
-static Path *generate_recursion_path(SetOperationStmt *setOp,
+static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
 						PlannerInfo *root,
 						List *refnames_tlist,
 						List **pTargetList);
-static Path *generate_union_path(SetOperationStmt *op, PlannerInfo *root,
+static RelOptInfo *generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 					List *refnames_tlist,
-					List **pTargetList,
-					double *pNumGroups);
-static Path *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
+					List **pTargetList);
+static RelOptInfo *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 					   List *refnames_tlist,
-					   List **pTargetList,
-					   double *pNumGroups);
+					   List **pTargetList);
 static List *plan_union_children(PlannerInfo *root,
 					SetOperationStmt *top_union,
 					List *refnames_tlist,
 					List **tlist_list);
 static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 				  PlannerInfo *root);
+static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 					Path *input_path,
 					double dNumGroups, double dNumOutputRows,
@@ -149,7 +148,6 @@ plan_set_operations(PlannerInfo *root)
 	RangeTblEntry *leftmostRTE;
 	Query	   *leftmostQuery;
 	RelOptInfo *setop_rel;
-	Path	   *path;
 	List	   *top_tlist;
 
 	Assert(topop);
@@ -181,57 +179,34 @@ plan_set_operations(PlannerInfo *root)
 	leftmostQuery = leftmostRTE->subquery;
 	Assert(leftmostQuery != NULL);
 
-	/*
-	 * We return our results in the (SETOP, NULL) upperrel.  For the moment,
-	 * this is also the parent rel of all Paths in the setop tree; we may well
-	 * change that in future.
-	 */
-	setop_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-
-	/*
-	 * We don't currently worry about setting setop_rel's consider_parallel
-	 * flag, nor about allowing FDWs to contribute paths to it.
-	 */
-
 	/*
 	 * If the topmost node is a recursive union, it needs special processing.
 	 */
 	if (root->hasRecursion)
 	{
-		path = generate_recursion_path(topop, root,
-									   leftmostQuery->targetList,
-									   &top_tlist);
+		setop_rel = generate_recursion_path(topop, root,
+											leftmostQuery->targetList,
+											&top_tlist);
 	}
 	else
 	{
 		/*
 		 * Recurse on setOperations tree to generate paths for set ops. The
-		 * final output path should have just the column types shown as the
+		 * final output paths should have just the column types shown as the
 		 * output from the top-level node, plus possibly resjunk working
 		 * columns (we can rely on upper-level nodes to deal with that).
 		 */
-		path = recurse_set_operations((Node *) topop, root,
-									  topop->colTypes, topop->colCollations,
-									  true, -1,
-									  leftmostQuery->targetList,
-									  &top_tlist,
-									  NULL);
+		setop_rel = recurse_set_operations((Node *) topop, root,
+										   topop->colTypes, topop->colCollations,
+										   true, -1,
+										   leftmostQuery->targetList,
+										   &top_tlist,
+										   NULL);
 	}
 
 	/* Must return the built tlist into root->processed_tlist. */
 	root->processed_tlist = top_tlist;
 
-	/* Add only the final path to the SETOP upperrel. */
-	add_path(setop_rel, path);
-
-	/* Let extensions possibly add some more paths */
-	if (create_upper_paths_hook)
-		(*create_upper_paths_hook) (root, UPPERREL_SETOP,
-									NULL, setop_rel);
-
-	/* Select cheapest path */
-	set_cheapest(setop_rel);
-
 	return setop_rel;
 }
 
@@ -245,21 +220,21 @@ plan_set_operations(PlannerInfo *root)
  * flag: if >= 0, add a resjunk output column indicating value of flag
  * refnames_tlist: targetlist to take column names from
  *
- * Returns a path for the subtree, as well as these output parameters:
+ * Returns a RelOptInfo for the subtree, as well as these output parameters:
  * *pTargetList: receives the fully-fledged tlist for the subtree's top plan
  * *pNumGroups: if not NULL, we estimate the number of distinct groups
  *		in the result, and store it there
  *
  * The pTargetList output parameter is mostly redundant with the pathtarget
- * of the returned path, but for the moment we need it because much of the
- * logic in this file depends on flag columns being marked resjunk.  Pending
- * a redesign of how that works, this is the easy way out.
+ * of the returned RelOptInfo, but for the moment we need it because much of
+ * the logic in this file depends on flag columns being marked resjunk.
+ * Pending a redesign of how that works, this is the easy way out.
  *
  * We don't have to care about typmods here: the only allowed difference
  * between set-op input and output typmods is input is a specific typmod
  * and output is -1, and that does not require a coercion.
  */
-static Path *
+static RelOptInfo *
 recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
@@ -267,6 +242,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List **pTargetList,
 					   double *pNumGroups)
 {
+	RelOptInfo *rel = NULL;		/* keep compiler quiet */
+
 	/* Guard against stack overflow due to overly complex setop nests */
 	check_stack_depth();
 
@@ -275,7 +252,6 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		RangeTblRef *rtr = (RangeTblRef *) setOp;
 		RangeTblEntry *rte = root->simple_rte_array[rtr->rtindex];
 		Query	   *subquery = rte->subquery;
-		RelOptInfo *rel;
 		PlannerInfo *subroot;
 		RelOptInfo *final_rel;
 		Path	   *subpath;
@@ -284,11 +260,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		Assert(subquery != NULL);
 
-		/*
-		 * We need to build a RelOptInfo for each leaf subquery.  This isn't
-		 * used for much here, but it carries the subroot data structures
-		 * forward to setrefs.c processing.
-		 */
+		/* Build a RelOptInfo for this leaf subquery. */
 		rel = build_simple_rel(root, rtr->rtindex, NULL);
 
 		/* plan_params should not be in use in current query level */
@@ -307,6 +279,18 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		if (root->plan_params)
 			elog(ERROR, "unexpected outer reference in set operation subquery");
 
+		/* Figure out the appropriate target list for this subquery. */
+		tlist = generate_setop_tlist(colTypes, colCollations,
+									 flag,
+									 rtr->rtindex,
+									 true,
+									 subroot->processed_tlist,
+									 refnames_tlist);
+		rel->reltarget = create_pathtarget(root, tlist);
+
+		/* Return the fully-fledged tlist to caller, too */
+		*pTargetList = tlist;
+
 		/*
 		 * Mark rel with estimated output rows, width, etc.  Note that we have
 		 * to do this before generating outer-query paths, else
@@ -334,22 +318,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		path = (Path *) create_subqueryscan_path(root, rel, subpath,
 												 NIL, NULL);
 
-		/*
-		 * Figure out the appropriate target list, and update the
-		 * SubqueryScanPath with the PathTarget form of that.
-		 */
-		tlist = generate_setop_tlist(colTypes, colCollations,
-									 flag,
-									 rtr->rtindex,
-									 true,
-									 subroot->processed_tlist,
-									 refnames_tlist);
-
-		path = apply_projection_to_path(root, rel, path,
-										create_pathtarget(root, tlist));
-
-		/* Return the fully-fledged tlist to caller, too */
-		*pTargetList = tlist;
+		add_path(rel, path);
 
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
@@ -378,25 +347,22 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 												  subpath->rows,
 												  NULL);
 		}
-
-		return (Path *) path;
 	}
 	else if (IsA(setOp, SetOperationStmt))
 	{
 		SetOperationStmt *op = (SetOperationStmt *) setOp;
-		Path	   *path;
 
 		/* UNIONs are much different from INTERSECT/EXCEPT */
 		if (op->op == SETOP_UNION)
-			path = generate_union_path(op, root,
-									   refnames_tlist,
-									   pTargetList,
-									   pNumGroups);
+			rel = generate_union_path(op, root,
+									  refnames_tlist,
+									  pTargetList);
 		else
-			path = generate_nonunion_path(op, root,
-										  refnames_tlist,
-										  pTargetList,
-										  pNumGroups);
+			rel = generate_nonunion_path(op, root,
+										 refnames_tlist,
+										 pTargetList);
+		if (pNumGroups)
+			*pNumGroups = rel->rows;
 
 		/*
 		 * If necessary, add a Result node to project the caller-requested
@@ -415,39 +381,70 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			!tlist_same_datatypes(*pTargetList, colTypes, junkOK) ||
 			!tlist_same_collations(*pTargetList, colCollations, junkOK))
 		{
+			PathTarget *target;
+			ListCell   *lc;
+
 			*pTargetList = generate_setop_tlist(colTypes, colCollations,
 												flag,
 												0,
 												false,
 												*pTargetList,
 												refnames_tlist);
-			path = apply_projection_to_path(root,
-											path->parent,
-											path,
-											create_pathtarget(root,
-															  *pTargetList));
+			target = create_pathtarget(root, *pTargetList);
+
+			/* Apply projection to each path */
+			foreach(lc, rel->pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				Path	   *path;
+
+				Assert(subpath->param_info == NULL);
+				path = apply_projection_to_path(root, subpath->parent,
+												subpath, target);
+				/* If we had to add a Result, path is different from subpath */
+				if (path != subpath)
+					lfirst(lc) = path;
+			}
+
+			/* Apply projection to each partial path */
+			foreach(lc, rel->partial_pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				Path	   *path;
+
+				Assert(subpath->param_info == NULL);
+
+				/* avoid apply_projection_to_path, in case of multiple refs */
+				path = (Path *) create_projection_path(root, subpath->parent,
+													   subpath, target);
+				lfirst(lc) = path;
+			}
 		}
-		return path;
 	}
 	else
 	{
 		elog(ERROR, "unrecognized node type: %d",
 			 (int) nodeTag(setOp));
 		*pTargetList = NIL;
-		return NULL;			/* keep compiler quiet */
 	}
+
+	postprocess_setop_rel(root, rel);
+
+	return rel;
 }
 
 /*
- * Generate path for a recursive UNION node
+ * Generate paths for a recursive UNION node
  */
-static Path *
+static RelOptInfo *
 generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 						List *refnames_tlist,
 						List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	RelOptInfo *result_rel;
 	Path	   *path;
+	RelOptInfo *lrel,
+			   *rrel;
 	Path	   *lpath;
 	Path	   *rpath;
 	List	   *lpath_tlist;
@@ -466,20 +463,22 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	 * Unlike a regular UNION node, process the left and right inputs
 	 * separately without any intention of combining them into one Append.
 	 */
-	lpath = recurse_set_operations(setOp->larg, root,
-								   setOp->colTypes, setOp->colCollations,
-								   false, -1,
-								   refnames_tlist,
-								   &lpath_tlist,
-								   NULL);
+	lrel = recurse_set_operations(setOp->larg, root,
+								  setOp->colTypes, setOp->colCollations,
+								  false, -1,
+								  refnames_tlist,
+								  &lpath_tlist,
+								  NULL);
+	lpath = lrel->cheapest_total_path;
 	/* The right path will want to look at the left one ... */
 	root->non_recursive_path = lpath;
-	rpath = recurse_set_operations(setOp->rarg, root,
-								   setOp->colTypes, setOp->colCollations,
-								   false, -1,
-								   refnames_tlist,
-								   &rpath_tlist,
-								   NULL);
+	rrel = recurse_set_operations(setOp->rarg, root,
+								  setOp->colTypes, setOp->colCollations,
+								  false, -1,
+								  refnames_tlist,
+								  &rpath_tlist,
+								  NULL);
+	rpath = rrel->cheapest_total_path;
 	root->non_recursive_path = NULL;
 
 	/*
@@ -491,6 +490,11 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP,
+								 bms_union(lrel->relids, rrel->relids));
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * If UNION, identify the grouping operators
 	 */
@@ -525,26 +529,30 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 											   result_rel,
 											   lpath,
 											   rpath,
-											   create_pathtarget(root, tlist),
+											   result_rel->reltarget,
 											   groupList,
 											   root->wt_param_id,
 											   dNumGroups);
 
-	return path;
+	add_path(result_rel, path);
+	postprocess_setop_rel(root, result_rel);
+	return result_rel;
 }
 
 /*
- * Generate path for a UNION or UNION ALL node
+ * Generate paths for a UNION or UNION ALL node
  */
-static Path *
+static RelOptInfo *
 generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 					List *refnames_tlist,
-					List **pTargetList,
-					double *pNumGroups)
+					List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	Relids		relids = NULL;
+	RelOptInfo *result_rel;
 	double		save_fraction = root->tuple_fraction;
-	List	   *pathlist;
+	ListCell   *lc;
+	List	   *pathlist = NIL;
+	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
 	Path	   *path;
@@ -569,7 +577,7 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	pathlist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+	rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -583,13 +591,24 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build path list and relid set. */
+	foreach(lc, rellist)
+	{
+		RelOptInfo *rel = lfirst(lc);
+
+		pathlist = lappend(pathlist, rel->cheapest_total_path);
+		relids = bms_union(relids, rel->relids);
+	}
+
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * Append the child results together.
 	 */
 	path = (Path *) create_append_path(result_rel, pathlist, NIL,
 									   NULL, 0, false, NIL, -1);
-	/* We have to manually jam the right tlist into the path; ick */
-	path->pathtarget = create_pathtarget(root, tlist);
 
 	/*
 	 * For UNION ALL, we just need the Append path.  For UNION, need to add
@@ -598,30 +617,32 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	if (!op->all)
 		path = make_union_unique(op, path, tlist, root);
 
+	add_path(result_rel, path);
+
 	/*
-	 * Estimate number of groups if caller wants it.  For now we just assume
-	 * the output is unique --- this is certainly true for the UNION case, and
-	 * we want worst-case estimates anyway.
+	 * Estimate number of groups.  For now we just assume the output is unique
+	 * --- this is certainly true for the UNION case, and we want worst-case
+	 * estimates anyway.
 	 */
-	if (pNumGroups)
-		*pNumGroups = path->rows;
+	result_rel->rows = path->rows;
 
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
-	return path;
+	return result_rel;
 }
 
 /*
- * Generate path for an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL node
+ * Generate paths for an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL node
  */
-static Path *
+static RelOptInfo *
 generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 					   List *refnames_tlist,
-					   List **pTargetList,
-					   double *pNumGroups)
+					   List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	RelOptInfo *result_rel;
+	RelOptInfo *lrel,
+			   *rrel;
 	double		save_fraction = root->tuple_fraction;
 	Path	   *lpath,
 			   *rpath,
@@ -646,18 +667,20 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 	root->tuple_fraction = 0.0;
 
 	/* Recurse on children, ensuring their outputs are marked */
-	lpath = recurse_set_operations(op->larg, root,
-								   op->colTypes, op->colCollations,
-								   false, 0,
-								   refnames_tlist,
-								   &lpath_tlist,
-								   &dLeftGroups);
-	rpath = recurse_set_operations(op->rarg, root,
-								   op->colTypes, op->colCollations,
-								   false, 1,
-								   refnames_tlist,
-								   &rpath_tlist,
-								   &dRightGroups);
+	lrel = recurse_set_operations(op->larg, root,
+								  op->colTypes, op->colCollations,
+								  false, 0,
+								  refnames_tlist,
+								  &lpath_tlist,
+								  &dLeftGroups);
+	lpath = lrel->cheapest_total_path;
+	rrel = recurse_set_operations(op->rarg, root,
+								  op->colTypes, op->colCollations,
+								  false, 1,
+								  refnames_tlist,
+								  &rpath_tlist,
+								  &dRightGroups);
+	rpath = rrel->cheapest_total_path;
 
 	/* Undo effects of forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
@@ -695,15 +718,17 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP,
+								 bms_union(lrel->relids, rrel->relids));
+	result_rel->reltarget = create_pathtarget(root, tlist);;
+
 	/*
 	 * Append the child results together.
 	 */
 	path = (Path *) create_append_path(result_rel, pathlist, NIL,
 									   NULL, 0, false, NIL, -1);
 
-	/* We have to manually jam the right tlist into the path; ick */
-	path->pathtarget = create_pathtarget(root, tlist);
-
 	/* Identify the grouping semantics */
 	groupList = generate_setop_grouplist(op, tlist);
 
@@ -769,10 +794,9 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 									  dNumGroups,
 									  dNumOutputRows);
 
-	if (pNumGroups)
-		*pNumGroups = dNumGroups;
-
-	return path;
+	result_rel->rows = path->rows;
+	add_path(result_rel, path);
+	return result_rel;
 }
 
 /*
@@ -897,8 +921,6 @@ make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 															   groupList,
 															   tlist),
 								 -1.0);
-		/* We have to manually jam the right tlist into the path; ick */
-		path->pathtarget = create_pathtarget(root, tlist);
 		path = (Path *) create_upper_unique_path(root,
 												 result_rel,
 												 path,
@@ -909,6 +931,24 @@ make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 	return path;
 }
 
+/*
+ * postprocess_setop_rel - perform steps required after adding paths
+ */
+static void
+postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	/*
+	 * We don't currently worry about allowing FDWs to contribute paths to
+	 * this relation, but give extensions a chance.
+	 */
+	if (create_upper_paths_hook)
+		(*create_upper_paths_hook) (root, UPPERREL_SETOP,
+									NULL, rel);
+
+	/* Select cheapest path */
+	set_cheapest(rel);
+}
+
 /*
  * choose_hashed_setop - should we use hashing for a set operation?
  */
-- 
2.14.3 (Apple Git-98)

0003-Rewrite-recurse_union_children-to-iterate-rather-tha.patchapplication/octet-stream; name=0003-Rewrite-recurse_union_children-to-iterate-rather-tha.patchDownload
From 19a7ad8b89d9a1fc11c105343bde9e01df31a204 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 12:13:46 -0500
Subject: [PATCH 3/5] Rewrite recurse_union_children to iterate, rather than
 recurse.

Also, rename it to plan_union_children, since the old name wasn't
very descriptive.  This results in a small net reduction in code,
seems (at least to me) easier to understand, and saves
space on the process stack.
---
 src/backend/optimizer/prep/prepunion.c | 100 ++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 53 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index b586f941a8..f387387289 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -78,10 +78,10 @@ static Path *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 					   List *refnames_tlist,
 					   List **pTargetList,
 					   double *pNumGroups);
-static List *recurse_union_children(Node *setOp, PlannerInfo *root,
-					   SetOperationStmt *top_union,
-					   List *refnames_tlist,
-					   List **tlist_list);
+static List *plan_union_children(PlannerInfo *root,
+					SetOperationStmt *top_union,
+					List *refnames_tlist,
+					List **tlist_list);
 static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 				  PlannerInfo *root);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
@@ -545,8 +545,6 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
 	double		save_fraction = root->tuple_fraction;
 	List	   *pathlist;
-	List	   *child_tlists1;
-	List	   *child_tlists2;
 	List	   *tlist_list;
 	List	   *tlist;
 	Path	   *path;
@@ -571,13 +569,7 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	pathlist = list_concat(recurse_union_children(op->larg, root,
-												  op, refnames_tlist,
-												  &child_tlists1),
-						   recurse_union_children(op->rarg, root,
-												  op, refnames_tlist,
-												  &child_tlists2));
-	tlist_list = list_concat(child_tlists1, child_tlists2);
+	pathlist = plan_union_children(root, op, refnames_tlist, &tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -797,56 +789,58 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
  * generate_union_path will force a fresh sort if the top level is a UNION.
  */
 static List *
-recurse_union_children(Node *setOp, PlannerInfo *root,
-					   SetOperationStmt *top_union,
-					   List *refnames_tlist,
-					   List **tlist_list)
+plan_union_children(PlannerInfo *root,
+					SetOperationStmt *top_union,
+					List *refnames_tlist,
+					List **tlist_list)
 {
-	List	   *result;
+	List	   *pending_rels = list_make1(top_union);
+	List	   *result = NIL;
 	List	   *child_tlist;
 
-	if (IsA(setOp, SetOperationStmt))
+	*tlist_list = NIL;
+
+	while (pending_rels != NIL)
 	{
-		SetOperationStmt *op = (SetOperationStmt *) setOp;
+		Node	   *setOp = linitial(pending_rels);
+
+		pending_rels = list_delete_first(pending_rels);
 
-		if (op->op == top_union->op &&
-			(op->all == top_union->all || op->all) &&
-			equal(op->colTypes, top_union->colTypes))
+		if (IsA(setOp, SetOperationStmt))
 		{
-			/* Same UNION, so fold children into parent's subpath list */
-			List	   *child_tlists1;
-			List	   *child_tlists2;
+			SetOperationStmt *op = (SetOperationStmt *) setOp;
 
-			result = list_concat(recurse_union_children(op->larg, root,
-														top_union,
-														refnames_tlist,
-														&child_tlists1),
-								 recurse_union_children(op->rarg, root,
-														top_union,
-														refnames_tlist,
-														&child_tlists2));
-			*tlist_list = list_concat(child_tlists1, child_tlists2);
-			return result;
+			if (op->op == top_union->op &&
+				(op->all == top_union->all || op->all) &&
+				equal(op->colTypes, top_union->colTypes))
+			{
+				/* Same UNION, so fold children into parent */
+				pending_rels = lcons(op->rarg, pending_rels);
+				pending_rels = lcons(op->larg, pending_rels);
+				continue;
+			}
 		}
+
+		/*
+		 * Not same, so plan this child separately.
+		 *
+		 * Note we disallow any resjunk columns in child results.  This is
+		 * necessary since the Append node that implements the union won't do
+		 * any projection, and upper levels will get confused if some of our
+		 * output tuples have junk and some don't.  This case only arises when
+		 * we have an EXCEPT or INTERSECT as child, else there won't be
+		 * resjunk anyway.
+		 */
+		result = lappend(result, recurse_set_operations(setOp, root,
+														top_union->colTypes,
+														top_union->colCollations,
+														false, -1,
+														refnames_tlist,
+														&child_tlist,
+														NULL));
+		*tlist_list = lappend(*tlist_list, child_tlist);
 	}
 
-	/*
-	 * Not same, so plan this child separately.
-	 *
-	 * Note we disallow any resjunk columns in child results.  This is
-	 * necessary since the Append node that implements the union won't do any
-	 * projection, and upper levels will get confused if some of our output
-	 * tuples have junk and some don't.  This case only arises when we have an
-	 * EXCEPT or INTERSECT as child, else there won't be resjunk anyway.
-	 */
-	result = list_make1(recurse_set_operations(setOp, root,
-											   top_union->colTypes,
-											   top_union->colCollations,
-											   false, -1,
-											   refnames_tlist,
-											   &child_tlist,
-											   NULL));
-	*tlist_list = list_make1(child_tlist);
 	return result;
 }
 
-- 
2.14.3 (Apple Git-98)

0002-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patchapplication/octet-stream; name=0002-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patchDownload
From 5e8b16bc7ef7043216ea6d48608973302b630870 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 11:53:07 -0500
Subject: [PATCH 2/5] Let Parallel Append over simple UNION ALL have partial
 subpaths.

A simple UNION ALL gets flattened into an appendrel of subquery
RTEs, but up until now it's been impossible for the appendrel to use
the partial paths for the subqueries, so we can implement the
appendrel as a Parallel Append but only one with non-partial paths
as children.

There are three separate obstacles to removing that limitation.
First, when planning a subquery, propagate any partial paths to the
final_rel so that they are potentially visible to outer query levels
(but not if they have initPlans attached, because that wouldn't be
safe).  Second, after planning a subquery, propagate any partial paths
for the final_rel to the subquery RTE in the outer query level in the
same way we do for non-partial paths.  Third, teach finalize_plan() to
account for the possibility that the fake parameter we use for rescan
signalling when the plan contains a Gather (Merge) node may be
propagated from an outer query level.

Patch by me, reviewed and tested by Amit Khandekar and Rajkumar
Raghuwanshi.  Test cases based on examples by Rajkumar Raghuwanshi.

Discussion: http://postgr.es/m/CA+Tgmoa6L9A1nNCk3aTDVZLZ4KkHDn1+tm7mFyFvP+uQPS7bAg@mail.gmail.com
---
 src/backend/optimizer/path/allpaths.c         | 22 +++++++++
 src/backend/optimizer/plan/planner.c          | 16 +++++++
 src/backend/optimizer/plan/subselect.c        | 17 ++++++-
 src/test/regress/expected/select_parallel.out | 65 +++++++++++++++++++++++++++
 src/test/regress/sql/select_parallel.sql      | 25 +++++++++++
 5 files changed, 143 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 1c792a00eb..ea4e683abb 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2179,6 +2179,28 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 				 create_subqueryscan_path(root, rel, subpath,
 										  pathkeys, required_outer));
 	}
+
+	/* If consider_parallel is false, there should be no partial paths. */
+	Assert(sub_final_rel->consider_parallel ||
+		   sub_final_rel->partial_pathlist == NIL);
+
+	/* Same for partial paths. */
+	foreach(lc, sub_final_rel->partial_pathlist)
+	{
+		Path	   *subpath = (Path *) lfirst(lc);
+		List	   *pathkeys;
+
+		/* Convert subpath's pathkeys to outer representation */
+		pathkeys = convert_subquery_pathkeys(root,
+											 rel,
+											 subpath->pathkeys,
+											 make_tlist_from_pathtarget(subpath->pathtarget));
+
+		/* Generate outer path using this subpath */
+		add_partial_path(rel, (Path *)
+						 create_subqueryscan_path(root, rel, subpath,
+												  pathkeys, required_outer));
+	}
 }
 
 /*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 14b7becf3e..91b32d4947 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -2194,6 +2194,22 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		add_path(final_rel, path);
 	}
 
+	/*
+	 * Generate partial paths for final_rel, too, if outer query levels might
+	 * be able to make use of them.
+	 */
+	if (final_rel->consider_parallel && root->query_level > 1 &&
+		!limit_needed(parse))
+	{
+		Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
+		foreach(lc, current_rel->partial_pathlist)
+		{
+			Path	   *partial_path = (Path *) lfirst(lc);
+
+			add_partial_path(final_rel, partial_path);
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding ForeignPaths.
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 46367cba63..32bf1fd955 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2202,6 +2202,13 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)
 		path->parallel_safe = false;
 	}
 
+	/*
+	 * Forget about any partial paths and clear consider_parallel, too;
+	 * they're not usable if we attached an initPlan.
+	 */
+	final_rel->partial_pathlist = NIL;
+	final_rel->consider_parallel = false;
+
 	/* We needn't do set_cheapest() here, caller will do it */
 }
 
@@ -2407,10 +2414,16 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 			{
 				SubqueryScan *sscan = (SubqueryScan *) plan;
 				RelOptInfo *rel;
+				Bitmapset  *subquery_params;
 
-				/* We must run SS_finalize_plan on the subquery */
+				/* We must run finalize_plan on the subquery */
 				rel = find_base_rel(root, sscan->scan.scanrelid);
-				SS_finalize_plan(rel->subroot, sscan->subplan);
+				subquery_params = rel->subroot->outer_params;
+				if (gather_param >= 0)
+					subquery_params = bms_add_member(bms_copy(subquery_params),
+													 gather_param);
+				finalize_plan(rel->subroot, sscan->subplan, gather_param,
+							  subquery_params, NULL);
 
 				/* Now we can add its extParams to the parent's params */
 				context.paramids = bms_add_members(context.paramids,
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 0a78261638..2fb16d1a15 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -890,4 +890,69 @@ select stringu1::int2 from tenk1 where unique1 = 1;
 ERROR:  invalid input syntax for integer: "BAAAAA"
 CONTEXT:  parallel worker
 ROLLBACK TO SAVEPOINT settings;
+-- test interaction with set-returning functions
+SAVEPOINT settings;
+-- multiple subqueries under a single Gather node
+-- must set parallel_setup_cost > 0 to discourage multiple Gather nodes
+SET LOCAL parallel_setup_cost = 10;
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1;
+                     QUERY PLAN                     
+----------------------------------------------------
+ Gather
+   Workers Planned: 4
+   ->  Parallel Append
+         ->  Parallel Seq Scan on tenk1
+               Filter: (fivethous = (tenthous + 1))
+         ->  Parallel Seq Scan on tenk1 tenk1_1
+               Filter: (fivethous = (tenthous + 1))
+(7 rows)
+
+ROLLBACK TO SAVEPOINT settings;
+-- can't use multiple subqueries under a single Gather node due to initPlans
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique1 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique2 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+ORDER BY 1;
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Sort
+   Sort Key: tenk1.unique1
+   ->  Append
+         ->  Gather
+               Workers Planned: 4
+               Params Evaluated: $1
+               InitPlan 1 (returns $1)
+                 ->  Limit
+                       ->  Gather
+                             Workers Planned: 4
+                             ->  Parallel Seq Scan on tenk1 tenk1_2
+                                   Filter: (fivethous = 1)
+               ->  Parallel Seq Scan on tenk1
+                     Filter: (fivethous = $1)
+         ->  Gather
+               Workers Planned: 4
+               Params Evaluated: $3
+               InitPlan 2 (returns $3)
+                 ->  Limit
+                       ->  Gather
+                             Workers Planned: 4
+                             ->  Parallel Seq Scan on tenk1 tenk1_3
+                                   Filter: (fivethous = 1)
+               ->  Parallel Seq Scan on tenk1 tenk1_1
+                     Filter: (fivethous = $3)
+(25 rows)
+
+-- test interaction with SRFs
+SELECT * FROM information_schema.foreign_data_wrapper_options
+ORDER BY 1, 2, 3;
+ foreign_data_wrapper_catalog | foreign_data_wrapper_name | option_name | option_value 
+------------------------------+---------------------------+-------------+--------------
+(0 rows)
+
 rollback;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index fa03aae0c0..ec817f2a4c 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -358,4 +358,29 @@ SET LOCAL force_parallel_mode = 1;
 select stringu1::int2 from tenk1 where unique1 = 1;
 ROLLBACK TO SAVEPOINT settings;
 
+-- test interaction with set-returning functions
+SAVEPOINT settings;
+
+-- multiple subqueries under a single Gather node
+-- must set parallel_setup_cost > 0 to discourage multiple Gather nodes
+SET LOCAL parallel_setup_cost = 10;
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1;
+ROLLBACK TO SAVEPOINT settings;
+
+-- can't use multiple subqueries under a single Gather node due to initPlans
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique1 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique2 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+ORDER BY 1;
+
+-- test interaction with SRFs
+SELECT * FROM information_schema.foreign_data_wrapper_options
+ORDER BY 1, 2, 3;
+
 rollback;
-- 
2.14.3 (Apple Git-98)

0001-Correctly-assess-parallel-safety-of-tlists-when-SRFs.patchapplication/octet-stream; name=0001-Correctly-assess-parallel-safety-of-tlists-when-SRFs.patchDownload
From f6b01233a9f02cae694220c86d0ca00297b278e5 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 7 Mar 2018 12:23:24 -0500
Subject: [PATCH 1/5] Correctly assess parallel-safety of tlists when SRFs are
 used.

Since commit 69f4b9c85f168ae006929eec44fc44d569e846b9, the existing
code was no longer assessing the parallel-safety of the real tlist
for each upper rel, but rather the first of possibly several tlists
created by split_pathtarget_at_srfs().  Repair.

Patch by me, per a report from Rajkumar Raghuwanshi.
---
 src/backend/optimizer/plan/planner.c | 52 ++++++++++++++++++++++++++++++------
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index de1257d9c2..14b7becf3e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -137,6 +137,7 @@ static Size estimate_hashagg_tablesize(Path *path,
 static RelOptInfo *create_grouping_paths(PlannerInfo *root,
 					  RelOptInfo *input_rel,
 					  PathTarget *target,
+					  bool target_parallel_safe,
 					  const AggClauseCosts *agg_costs,
 					  grouping_sets_data *gd);
 static void consider_groupingsets_paths(PlannerInfo *root,
@@ -152,6 +153,7 @@ static RelOptInfo *create_window_paths(PlannerInfo *root,
 					RelOptInfo *input_rel,
 					PathTarget *input_target,
 					PathTarget *output_target,
+					bool output_target_parallel_safe,
 					List *tlist,
 					WindowFuncLists *wflists,
 					List *activeWindows);
@@ -168,6 +170,7 @@ static RelOptInfo *create_distinct_paths(PlannerInfo *root,
 static RelOptInfo *create_ordered_paths(PlannerInfo *root,
 					 RelOptInfo *input_rel,
 					 PathTarget *target,
+					 bool target_parallel_safe,
 					 double limit_tuples);
 static PathTarget *make_group_input_target(PlannerInfo *root,
 						PathTarget *final_target);
@@ -1583,6 +1586,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 	PathTarget *final_target;
 	List	   *final_targets;
 	List	   *final_targets_contain_srfs;
+	bool		final_target_parallel_safe;
 	RelOptInfo *current_rel;
 	RelOptInfo *final_rel;
 	ListCell   *lc;
@@ -1645,6 +1649,10 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		/* Also extract the PathTarget form of the setop result tlist */
 		final_target = current_rel->cheapest_total_path->pathtarget;
 
+		/* And check whether it's parallel safe */
+		final_target_parallel_safe =
+			is_parallel_safe(root, (Node *) final_target->exprs);
+
 		/* The setop result tlist couldn't contain any SRFs */
 		Assert(!parse->hasTargetSRFs);
 		final_targets = final_targets_contain_srfs = NIL;
@@ -1676,12 +1684,15 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		PathTarget *sort_input_target;
 		List	   *sort_input_targets;
 		List	   *sort_input_targets_contain_srfs;
+		bool		sort_input_target_parallel_safe;
 		PathTarget *grouping_target;
 		List	   *grouping_targets;
 		List	   *grouping_targets_contain_srfs;
+		bool		grouping_target_parallel_safe;
 		PathTarget *scanjoin_target;
 		List	   *scanjoin_targets;
 		List	   *scanjoin_targets_contain_srfs;
+		bool		scanjoin_target_parallel_safe;
 		bool		have_grouping;
 		AggClauseCosts agg_costs;
 		WindowFuncLists *wflists = NULL;
@@ -1805,6 +1816,8 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		 * that were obtained within query_planner().
 		 */
 		final_target = create_pathtarget(root, tlist);
+		final_target_parallel_safe =
+			is_parallel_safe(root, (Node *) final_target->exprs);
 
 		/*
 		 * If ORDER BY was given, consider whether we should use a post-sort
@@ -1812,11 +1825,18 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		 * so.
 		 */
 		if (parse->sortClause)
+		{
 			sort_input_target = make_sort_input_target(root,
 													   final_target,
 													   &have_postponed_srfs);
+			sort_input_target_parallel_safe =
+				is_parallel_safe(root, (Node *) sort_input_target->exprs);
+		}
 		else
+		{
 			sort_input_target = final_target;
+			sort_input_target_parallel_safe = final_target_parallel_safe;
+		}
 
 		/*
 		 * If we have window functions to deal with, the output from any
@@ -1824,11 +1844,18 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		 * otherwise, it should be sort_input_target.
 		 */
 		if (activeWindows)
+		{
 			grouping_target = make_window_input_target(root,
 													   final_target,
 													   activeWindows);
+			grouping_target_parallel_safe =
+				is_parallel_safe(root, (Node *) grouping_target->exprs);
+		}
 		else
+		{
 			grouping_target = sort_input_target;
+			grouping_target_parallel_safe = sort_input_target_parallel_safe;
+		}
 
 		/*
 		 * If we have grouping or aggregation to do, the topmost scan/join
@@ -1838,9 +1865,16 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		have_grouping = (parse->groupClause || parse->groupingSets ||
 						 parse->hasAggs || root->hasHavingQual);
 		if (have_grouping)
+		{
 			scanjoin_target = make_group_input_target(root, final_target);
+			scanjoin_target_parallel_safe =
+				is_parallel_safe(root, (Node *) grouping_target->exprs);
+		}
 		else
+		{
 			scanjoin_target = grouping_target;
+			scanjoin_target_parallel_safe = grouping_target_parallel_safe;
+		}
 
 		/*
 		 * If there are any SRFs in the targetlist, we must separate each of
@@ -1922,8 +1956,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		 * for partial paths.  But only parallel-safe expressions can be
 		 * computed by partial paths.
 		 */
-		if (current_rel->partial_pathlist &&
-			is_parallel_safe(root, (Node *) scanjoin_target->exprs))
+		if (current_rel->partial_pathlist && scanjoin_target_parallel_safe)
 		{
 			/* Apply the scan/join target to each partial path */
 			foreach(lc, current_rel->partial_pathlist)
@@ -1984,6 +2017,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 			current_rel = create_grouping_paths(root,
 												current_rel,
 												grouping_target,
+												grouping_target_parallel_safe,
 												&agg_costs,
 												gset_data);
 			/* Fix things up if grouping_target contains SRFs */
@@ -2003,6 +2037,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 											  current_rel,
 											  grouping_target,
 											  sort_input_target,
+											  sort_input_target_parallel_safe,
 											  tlist,
 											  wflists,
 											  activeWindows);
@@ -2036,6 +2071,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		current_rel = create_ordered_paths(root,
 										   current_rel,
 										   final_target,
+										   final_target_parallel_safe,
 										   have_postponed_srfs ? -1.0 :
 										   limit_tuples);
 		/* Fix things up if final_target contains SRFs */
@@ -3623,6 +3659,7 @@ static RelOptInfo *
 create_grouping_paths(PlannerInfo *root,
 					  RelOptInfo *input_rel,
 					  PathTarget *target,
+					  bool target_parallel_safe,
 					  const AggClauseCosts *agg_costs,
 					  grouping_sets_data *gd)
 {
@@ -3652,8 +3689,7 @@ create_grouping_paths(PlannerInfo *root,
 	 * target list and HAVING quals are parallel-safe.  The partially grouped
 	 * relation obeys the same rules.
 	 */
-	if (input_rel->consider_parallel &&
-		is_parallel_safe(root, (Node *) target->exprs) &&
+	if (input_rel->consider_parallel && target_parallel_safe &&
 		is_parallel_safe(root, (Node *) parse->havingQual))
 	{
 		grouped_rel->consider_parallel = true;
@@ -4230,6 +4266,7 @@ create_window_paths(PlannerInfo *root,
 					RelOptInfo *input_rel,
 					PathTarget *input_target,
 					PathTarget *output_target,
+					bool output_target_parallel_safe,
 					List *tlist,
 					WindowFuncLists *wflists,
 					List *activeWindows)
@@ -4245,8 +4282,7 @@ create_window_paths(PlannerInfo *root,
 	 * can't be parallel-safe, either.  Otherwise, we need to examine the
 	 * target list and active windows for non-parallel-safe constructs.
 	 */
-	if (input_rel->consider_parallel &&
-		is_parallel_safe(root, (Node *) output_target->exprs) &&
+	if (input_rel->consider_parallel && output_target_parallel_safe &&
 		is_parallel_safe(root, (Node *) activeWindows))
 		window_rel->consider_parallel = true;
 
@@ -4621,6 +4657,7 @@ static RelOptInfo *
 create_ordered_paths(PlannerInfo *root,
 					 RelOptInfo *input_rel,
 					 PathTarget *target,
+					 bool target_parallel_safe,
 					 double limit_tuples)
 {
 	Path	   *cheapest_input_path = input_rel->cheapest_total_path;
@@ -4635,8 +4672,7 @@ create_ordered_paths(PlannerInfo *root,
 	 * can't be parallel-safe, either.  Otherwise, it's parallel-safe if the
 	 * target list is parallel-safe.
 	 */
-	if (input_rel->consider_parallel &&
-		is_parallel_safe(root, (Node *) target->exprs))
+	if (input_rel->consider_parallel && target_parallel_safe)
 		ordered_rel->consider_parallel = true;
 
 	/*
-- 
2.14.3 (Apple Git-98)

#10Rajkumar Raghuwanshi
rajkumar.raghuwanshi@enterprisedb.com
In reply to: Robert Haas (#9)
Re: parallel append vs. simple UNION ALL

On Thu, Mar 8, 2018 at 12:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:

New patches attached, fixing all 3 of the issues you reported:

Thanks. The new patches applied cleanly on head and fix all the reported issues.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

#11Robert Haas
robertmhaas@gmail.com
In reply to: Rajkumar Raghuwanshi (#10)
Re: parallel append vs. simple UNION ALL

On Thu, Mar 8, 2018 at 2:46 AM, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:

On Thu, Mar 8, 2018 at 12:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:

New patches attached, fixing all 3 of the issues you reported:

Thanks. The new patches applied cleanly on head and fix all the reported issues.

Great. Committed 0001. Are you planning any further testing of this
patch series?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Robert Haas
robertmhaas@gmail.com
In reply to: Ashutosh Bapat (#6)
4 attachment(s)
Re: parallel append vs. simple UNION ALL

On Thu, Mar 1, 2018 at 8:30 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

The patches look clean. I particularly looked at 0003.

patch 0001
+    /*
+     * Generate partial paths for final_rel, too, if outer query levels might
+     * be able to make use of them.
+     */
I am not able to understand the construct, especially the "if" clause. Did you
want to say "... if there are outer query levels. Those might ..." or something
like that?

Well, that's what I meant, but I didn't think it was necessary to
spell it out in quite that much detail.

0002
(op->all == top_union->all || op->all) &&
This isn't really your change. Checking
op->all is cheaper than checking equality, so maybe we should check that first
and take advantage of short-circuit condition evaluation. If we do that, the
above condition reduces to (op->all || !top_union->all), which is just two
boolean tests and even cheaper. But maybe the second optimization is not worth
the loss of readability.

I doubt this makes any difference. The compiler should be smart
enough to do this the same way regardless of exactly how we write it,
and if it's not, it can't make more than a few instructions' worth of
difference. This code is not nearly performance-critical enough for
that to matter. Also, it's not the job of this patch to whack this
around.
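
For anyone who wants to convince themselves of that equivalence, here is a
throwaway standalone check (nothing from the patches, just an illustration,
with a and b standing in for op->all and top_union->all) that enumerates all
four combinations:

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

int
main(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
		{
			bool		original = (a == b) || a;	/* as the code writes it */
			bool		reduced = a || !b;	/* the proposed simplification */

			/* the two forms agree for every combination of inputs */
			assert(original == reduced);
			printf("a=%d b=%d -> %d\n", a, b, (int) original);
		}
	return 0;
}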

"identically-propertied UNIONs" may be "UNIONs with identical properties".

Likewise, I didn't write those words, so I don't plan to change them
just because I might have written them differently.

0003
Probably we want to rename generate_union_path() as generate_union_rel() or
generate_union_paths() since the function doesn't return a path anymore.
Similarly for generate_nonunion_path().

Good point. Changed.

In recurse_set_operations()
- return NULL; /* keep compiler quiet */
This line is deleted and instead rel is initialized to NULL. That way we lose
any chance of detecting, via a compiler warning, a future bug where some block
leaves rel uninitialized. Maybe we should replace "return NULL" with "rel =
NULL", which will not be executed because of the error.

I don't think this is really a problem. If some code path fails to
initialize rel, it's going to crash when postprocess_setop_rel() calls
set_cheapest(). Any warning the compiler gives is more likely to be a
false positive than an actual problem.
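
To make that trade-off concrete, here is a tiny standalone sketch (made-up
names, not the actual planner code): rel starts out NULL, every branch is
expected to assign it, and a branch that forgets shows up as a deterministic
null-pointer crash at the set_cheapest() stand-in rather than as a read of an
uninitialized pointer. Leaving rel uninitialized instead is what preserves the
chance of a compiler warning (for example GCC's -Wmaybe-uninitialized), which
is the point being raised above.

#include <stdio.h>
#include <stdlib.h>

/* toy stand-ins for the planner structures; names invented for illustration */
typedef enum NodeTag
{
	T_Leaf,
	T_Union,
	T_Nonunion
} NodeTag;

typedef struct Rel
{
	const char *label;
} Rel;

static Rel *
make_rel(const char *label)
{
	Rel		   *rel = malloc(sizeof(Rel));

	rel->label = label;
	return rel;
}

static Rel *
plan_setop(NodeTag tag)
{
	Rel		   *rel = NULL;		/* keep compiler quiet */

	if (tag == T_Leaf)
		rel = make_rel("leaf subquery");
	else if (tag == T_Union)
		rel = make_rel("union");
	else if (tag == T_Nonunion)
		rel = make_rel("intersect/except");
	/* a forgotten branch would fall through here with rel still NULL ... */

	/* ... and crash immediately at this stand-in for set_cheapest() */
	printf("cheapest path chosen for: %s\n", rel->label);
	return rel;
}

int
main(void)
{
	plan_setop(T_Union);
	return 0;
}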

+    /* Build path list and relid set. */
+    foreach(lc, rellist)
+    {
With the changes in this patch, we could actually use add_paths_to_append_rel()
to create an append path. That function builds paths with different pathkeys
and parameterization (which doesn't matter here) and also handles parallel
append. So we could avoid code duplication and also leverage more
optimizations, such as using MergeAppend instead of an overall sort. But that
function doesn't have the ability to add a final node like
make_union_unique(). A similar requirement has arisen in partition-wise join,
where we need to add a final node for finalizing an aggregate on top of the
paths created by add_paths_to_append_rel(). Maybe we can change that function
to return a list of paths, which are then finalized by the caller and added to
the "append" rel. But I don't think doing all that is in the scope of this
patch set.

Yeah, I thought about all of that and came to similar conclusions.

0004
+        if (!op->all)
+            ppath = make_union_unique(op, ppath, tlist, root);
We could probably push the grouping/sorting down to the parallel workers, but
again that's not part of this patchset.

Yep. There's a lot more work that could be done to improve setop
planning, but I think getting even this much done for v11 would be a
significant step forward.

Updated patches attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0004-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patchapplication/octet-stream; name=0004-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patchDownload
From cf37a7afde5ff50bd950e2e172c0e9ddf730ee13 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Sat, 23 Dec 2017 10:56:07 -0800
Subject: [PATCH 4/4] Consider Parallel Append as a way to implement a union
 operation in a setop tree.

Patch by me, reviewed by Ashutosh Bapat.
---
 src/backend/optimizer/prep/prepunion.c | 93 +++++++++++++++++++++++++++++++++-
 1 file changed, 91 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index e477a04f94..b80b9a95bc 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -298,12 +298,18 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		 */
 		set_subquery_size_estimates(root, rel);
 
+		/*
+		 * Since we may want to add a partial path to this relation, we must
+		 * set its consider_parallel flag correctly.
+		 */
+		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+		rel->consider_parallel = final_rel->consider_parallel;
+
 		/*
 		 * For the moment, we consider only a single Path for the subquery.
 		 * This should change soon (make it look more like
 		 * set_subquery_pathlist).
 		 */
-		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
 		subpath = get_cheapest_fractional_path(final_rel,
 											   root->tuple_fraction);
 
@@ -320,6 +326,23 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		add_path(rel, path);
 
+		/*
+		 * If we have a partial path for the child relation, we can use that
+		 * to build a partial path for this relation.  But there's no point in
+		 * considering any path but the cheapest.
+		 */
+		if (final_rel->partial_pathlist != NIL)
+		{
+			Path	   *partial_subpath;
+			Path	   *partial_path;
+
+			partial_subpath = linitial(final_rel->partial_pathlist);
+			partial_path = (Path *)
+				create_subqueryscan_path(root, rel, partial_subpath,
+										 NIL, NULL);
+			add_partial_path(rel, partial_path);
+		}
+
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
 		 * grouping or aggregation, its output is probably mostly unique
@@ -552,6 +575,9 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	double		save_fraction = root->tuple_fraction;
 	ListCell   *lc;
 	List	   *pathlist = NIL;
+	List	   *partial_pathlist = NIL;
+	bool		partial_paths_valid = true;
+	bool		consider_parallel = true;
 	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
@@ -591,18 +617,34 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
-	/* Build path list and relid set. */
+	/* Build path lists and relid set. */
 	foreach(lc, rellist)
 	{
 		RelOptInfo *rel = lfirst(lc);
 
 		pathlist = lappend(pathlist, rel->cheapest_total_path);
+
+		if (consider_parallel)
+		{
+			if (!rel->consider_parallel)
+			{
+				consider_parallel = false;
+				partial_paths_valid = false;
+			}
+			else if (rel->partial_pathlist == NIL)
+				partial_paths_valid = false;
+			else
+				partial_pathlist = lappend(partial_pathlist,
+										   linitial(rel->partial_pathlist));
+		}
+
 		relids = bms_union(relids, rel->relids);
 	}
 
 	/* Build result relation. */
 	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
 	result_rel->reltarget = create_pathtarget(root, tlist);
+	result_rel->consider_parallel = consider_parallel;
 
 	/*
 	 * Append the child results together.
@@ -626,6 +668,53 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	result_rel->rows = path->rows;
 
+	/*
+	 * Now consider doing the same thing using the partial paths plus Append
+	 * plus Gather.
+	 */
+	if (partial_paths_valid)
+	{
+		Path	   *ppath;
+		ListCell   *lc;
+		int			parallel_workers = 0;
+
+		/* Find the highest number of workers requested for any subpath. */
+		foreach(lc, partial_pathlist)
+		{
+			Path	   *path = lfirst(lc);
+
+			parallel_workers = Max(parallel_workers, path->parallel_workers);
+		}
+		Assert(parallel_workers > 0);
+
+		/*
+		 * If the use of parallel append is permitted, always request at least
+		 * log2(# of children) paths.  We assume it can be useful to have
+		 * extra workers in this case because they will be spread out across
+		 * the children.  The precise formula is just a guess; see
+		 * add_paths_to_append_rel.
+		 */
+		if (enable_parallel_append)
+		{
+			parallel_workers = Max(parallel_workers,
+								   fls(list_length(partial_pathlist)));
+			parallel_workers = Min(parallel_workers,
+								   max_parallel_workers_per_gather);
+		}
+		Assert(parallel_workers > 0);
+
+		ppath = (Path *)
+			create_append_path(result_rel, NIL, partial_pathlist,
+							   NULL, parallel_workers, enable_parallel_append,
+							   NIL /* XXX? Is this right? */ , -1);
+		ppath = (Path *)
+			create_gather_path(root, result_rel, ppath,
+							   result_rel->reltarget, NULL, NULL);
+		if (!op->all)
+			ppath = make_union_unique(op, ppath, tlist, root);
+		add_path(result_rel, ppath);
+	}
+
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
-- 
2.14.3 (Apple Git-98)

0003-Generate-a-separate-upper-relation-for-each-stage-of.patchapplication/octet-stream; name=0003-Generate-a-separate-upper-relation-for-each-stage-of.patchDownload
From 0dd95f93a9852882aa792424705acb8f5d22f579 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 15:42:51 -0500
Subject: [PATCH 3/4] Generate a separate upper relation for each stage of
 setop planning.

Commit 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f made setop planning
stages return paths rather than plans, but all such paths were loosely
associated with a single RelOptInfo, and only the final path was added
to the RelOptInfo.  Even at the time, it was foreseen that this should
be changed, because there is otherwise no good way for a single stage
of setop planning to return multiple paths.  With this patch, each
stage of set operation planning now creates a separate RelOptInfo;
these are distinguished by using appropriate relid sets.  Note that
this patch does nothing whatsoever about actually returning multiple
paths for the same set operation; it just makes it possible for a
future patch to do so.

Along the way, adjust things so that create_upper_paths_hook is called
for each of these new RelOptInfos rather than just once, since that
might be useful to extensions using that hook.  It might be a good
to provide an FDW API here as well, but I didn't try to do that for
now.

Patch by me, reviewed by Ashutosh Bapat.
---
 src/backend/optimizer/prep/prepunion.c | 342 ++++++++++++++++++---------------
 1 file changed, 191 insertions(+), 151 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index f387387289..e477a04f94 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -60,30 +60,29 @@ typedef struct
 	AppendRelInfo **appinfos;
 } adjust_appendrel_attrs_context;
 
-static Path *recurse_set_operations(Node *setOp, PlannerInfo *root,
+static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
 					   int flag, List *refnames_tlist,
 					   List **pTargetList,
 					   double *pNumGroups);
-static Path *generate_recursion_path(SetOperationStmt *setOp,
+static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
 						PlannerInfo *root,
 						List *refnames_tlist,
 						List **pTargetList);
-static Path *generate_union_path(SetOperationStmt *op, PlannerInfo *root,
-					List *refnames_tlist,
-					List **pTargetList,
-					double *pNumGroups);
-static Path *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
-					   List *refnames_tlist,
-					   List **pTargetList,
-					   double *pNumGroups);
+static RelOptInfo *generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
+					 List *refnames_tlist,
+					 List **pTargetList);
+static RelOptInfo *generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
+						List *refnames_tlist,
+						List **pTargetList);
 static List *plan_union_children(PlannerInfo *root,
 					SetOperationStmt *top_union,
 					List *refnames_tlist,
 					List **tlist_list);
 static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 				  PlannerInfo *root);
+static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 					Path *input_path,
 					double dNumGroups, double dNumOutputRows,
@@ -149,7 +148,6 @@ plan_set_operations(PlannerInfo *root)
 	RangeTblEntry *leftmostRTE;
 	Query	   *leftmostQuery;
 	RelOptInfo *setop_rel;
-	Path	   *path;
 	List	   *top_tlist;
 
 	Assert(topop);
@@ -181,57 +179,34 @@ plan_set_operations(PlannerInfo *root)
 	leftmostQuery = leftmostRTE->subquery;
 	Assert(leftmostQuery != NULL);
 
-	/*
-	 * We return our results in the (SETOP, NULL) upperrel.  For the moment,
-	 * this is also the parent rel of all Paths in the setop tree; we may well
-	 * change that in future.
-	 */
-	setop_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-
-	/*
-	 * We don't currently worry about setting setop_rel's consider_parallel
-	 * flag, nor about allowing FDWs to contribute paths to it.
-	 */
-
 	/*
 	 * If the topmost node is a recursive union, it needs special processing.
 	 */
 	if (root->hasRecursion)
 	{
-		path = generate_recursion_path(topop, root,
-									   leftmostQuery->targetList,
-									   &top_tlist);
+		setop_rel = generate_recursion_path(topop, root,
+											leftmostQuery->targetList,
+											&top_tlist);
 	}
 	else
 	{
 		/*
 		 * Recurse on setOperations tree to generate paths for set ops. The
-		 * final output path should have just the column types shown as the
+		 * final output paths should have just the column types shown as the
 		 * output from the top-level node, plus possibly resjunk working
 		 * columns (we can rely on upper-level nodes to deal with that).
 		 */
-		path = recurse_set_operations((Node *) topop, root,
-									  topop->colTypes, topop->colCollations,
-									  true, -1,
-									  leftmostQuery->targetList,
-									  &top_tlist,
-									  NULL);
+		setop_rel = recurse_set_operations((Node *) topop, root,
+										   topop->colTypes, topop->colCollations,
+										   true, -1,
+										   leftmostQuery->targetList,
+										   &top_tlist,
+										   NULL);
 	}
 
 	/* Must return the built tlist into root->processed_tlist. */
 	root->processed_tlist = top_tlist;
 
-	/* Add only the final path to the SETOP upperrel. */
-	add_path(setop_rel, path);
-
-	/* Let extensions possibly add some more paths */
-	if (create_upper_paths_hook)
-		(*create_upper_paths_hook) (root, UPPERREL_SETOP,
-									NULL, setop_rel);
-
-	/* Select cheapest path */
-	set_cheapest(setop_rel);
-
 	return setop_rel;
 }
 
@@ -245,21 +220,21 @@ plan_set_operations(PlannerInfo *root)
  * flag: if >= 0, add a resjunk output column indicating value of flag
  * refnames_tlist: targetlist to take column names from
  *
- * Returns a path for the subtree, as well as these output parameters:
+ * Returns a RelOptInfo for the subtree, as well as these output parameters:
  * *pTargetList: receives the fully-fledged tlist for the subtree's top plan
  * *pNumGroups: if not NULL, we estimate the number of distinct groups
  *		in the result, and store it there
  *
  * The pTargetList output parameter is mostly redundant with the pathtarget
- * of the returned path, but for the moment we need it because much of the
- * logic in this file depends on flag columns being marked resjunk.  Pending
- * a redesign of how that works, this is the easy way out.
+ * of the returned RelOptInfo, but for the moment we need it because much of
+ * the logic in this file depends on flag columns being marked resjunk.
+ * Pending a redesign of how that works, this is the easy way out.
  *
  * We don't have to care about typmods here: the only allowed difference
  * between set-op input and output typmods is input is a specific typmod
  * and output is -1, and that does not require a coercion.
  */
-static Path *
+static RelOptInfo *
 recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
@@ -267,6 +242,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List **pTargetList,
 					   double *pNumGroups)
 {
+	RelOptInfo *rel = NULL;		/* keep compiler quiet */
+
 	/* Guard against stack overflow due to overly complex setop nests */
 	check_stack_depth();
 
@@ -275,7 +252,6 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		RangeTblRef *rtr = (RangeTblRef *) setOp;
 		RangeTblEntry *rte = root->simple_rte_array[rtr->rtindex];
 		Query	   *subquery = rte->subquery;
-		RelOptInfo *rel;
 		PlannerInfo *subroot;
 		RelOptInfo *final_rel;
 		Path	   *subpath;
@@ -284,11 +260,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		Assert(subquery != NULL);
 
-		/*
-		 * We need to build a RelOptInfo for each leaf subquery.  This isn't
-		 * used for much here, but it carries the subroot data structures
-		 * forward to setrefs.c processing.
-		 */
+		/* Build a RelOptInfo for this leaf subquery. */
 		rel = build_simple_rel(root, rtr->rtindex, NULL);
 
 		/* plan_params should not be in use in current query level */
@@ -307,6 +279,18 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		if (root->plan_params)
 			elog(ERROR, "unexpected outer reference in set operation subquery");
 
+		/* Figure out the appropriate target list for this subquery. */
+		tlist = generate_setop_tlist(colTypes, colCollations,
+									 flag,
+									 rtr->rtindex,
+									 true,
+									 subroot->processed_tlist,
+									 refnames_tlist);
+		rel->reltarget = create_pathtarget(root, tlist);
+
+		/* Return the fully-fledged tlist to caller, too */
+		*pTargetList = tlist;
+
 		/*
 		 * Mark rel with estimated output rows, width, etc.  Note that we have
 		 * to do this before generating outer-query paths, else
@@ -334,22 +318,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		path = (Path *) create_subqueryscan_path(root, rel, subpath,
 												 NIL, NULL);
 
-		/*
-		 * Figure out the appropriate target list, and update the
-		 * SubqueryScanPath with the PathTarget form of that.
-		 */
-		tlist = generate_setop_tlist(colTypes, colCollations,
-									 flag,
-									 rtr->rtindex,
-									 true,
-									 subroot->processed_tlist,
-									 refnames_tlist);
-
-		path = apply_projection_to_path(root, rel, path,
-										create_pathtarget(root, tlist));
-
-		/* Return the fully-fledged tlist to caller, too */
-		*pTargetList = tlist;
+		add_path(rel, path);
 
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
@@ -378,25 +347,22 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 												  subpath->rows,
 												  NULL);
 		}
-
-		return (Path *) path;
 	}
 	else if (IsA(setOp, SetOperationStmt))
 	{
 		SetOperationStmt *op = (SetOperationStmt *) setOp;
-		Path	   *path;
 
 		/* UNIONs are much different from INTERSECT/EXCEPT */
 		if (op->op == SETOP_UNION)
-			path = generate_union_path(op, root,
-									   refnames_tlist,
-									   pTargetList,
-									   pNumGroups);
+			rel = generate_union_path(op, root,
+									  refnames_tlist,
+									  pTargetList);
 		else
-			path = generate_nonunion_path(op, root,
-										  refnames_tlist,
-										  pTargetList,
-										  pNumGroups);
+			rel = generate_nonunion_path(op, root,
+										 refnames_tlist,
+										 pTargetList);
+		if (pNumGroups)
+			*pNumGroups = rel->rows;
 
 		/*
 		 * If necessary, add a Result node to project the caller-requested
@@ -415,39 +381,70 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			!tlist_same_datatypes(*pTargetList, colTypes, junkOK) ||
 			!tlist_same_collations(*pTargetList, colCollations, junkOK))
 		{
+			PathTarget *target;
+			ListCell   *lc;
+
 			*pTargetList = generate_setop_tlist(colTypes, colCollations,
 												flag,
 												0,
 												false,
 												*pTargetList,
 												refnames_tlist);
-			path = apply_projection_to_path(root,
-											path->parent,
-											path,
-											create_pathtarget(root,
-															  *pTargetList));
+			target = create_pathtarget(root, *pTargetList);
+
+			/* Apply projection to each path */
+			foreach(lc, rel->pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				Path	   *path;
+
+				Assert(subpath->param_info == NULL);
+				path = apply_projection_to_path(root, subpath->parent,
+												subpath, target);
+				/* If we had to add a Result, path is different from subpath */
+				if (path != subpath)
+					lfirst(lc) = path;
+			}
+
+			/* Apply projection to each partial path */
+			foreach(lc, rel->partial_pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				Path	   *path;
+
+				Assert(subpath->param_info == NULL);
+
+				/* avoid apply_projection_to_path, in case of multiple refs */
+				path = (Path *) create_projection_path(root, subpath->parent,
+													   subpath, target);
+				lfirst(lc) = path;
+			}
 		}
-		return path;
 	}
 	else
 	{
 		elog(ERROR, "unrecognized node type: %d",
 			 (int) nodeTag(setOp));
 		*pTargetList = NIL;
-		return NULL;			/* keep compiler quiet */
 	}
+
+	postprocess_setop_rel(root, rel);
+
+	return rel;
 }
 
 /*
- * Generate path for a recursive UNION node
+ * Generate paths for a recursive UNION node
  */
-static Path *
+static RelOptInfo *
 generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 						List *refnames_tlist,
 						List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	RelOptInfo *result_rel;
 	Path	   *path;
+	RelOptInfo *lrel,
+			   *rrel;
 	Path	   *lpath;
 	Path	   *rpath;
 	List	   *lpath_tlist;
@@ -466,20 +463,22 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	 * Unlike a regular UNION node, process the left and right inputs
 	 * separately without any intention of combining them into one Append.
 	 */
-	lpath = recurse_set_operations(setOp->larg, root,
-								   setOp->colTypes, setOp->colCollations,
-								   false, -1,
-								   refnames_tlist,
-								   &lpath_tlist,
-								   NULL);
+	lrel = recurse_set_operations(setOp->larg, root,
+								  setOp->colTypes, setOp->colCollations,
+								  false, -1,
+								  refnames_tlist,
+								  &lpath_tlist,
+								  NULL);
+	lpath = lrel->cheapest_total_path;
 	/* The right path will want to look at the left one ... */
 	root->non_recursive_path = lpath;
-	rpath = recurse_set_operations(setOp->rarg, root,
-								   setOp->colTypes, setOp->colCollations,
-								   false, -1,
-								   refnames_tlist,
-								   &rpath_tlist,
-								   NULL);
+	rrel = recurse_set_operations(setOp->rarg, root,
+								  setOp->colTypes, setOp->colCollations,
+								  false, -1,
+								  refnames_tlist,
+								  &rpath_tlist,
+								  NULL);
+	rpath = rrel->cheapest_total_path;
 	root->non_recursive_path = NULL;
 
 	/*
@@ -491,6 +490,11 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP,
+								 bms_union(lrel->relids, rrel->relids));
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * If UNION, identify the grouping operators
 	 */
@@ -525,26 +529,30 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 											   result_rel,
 											   lpath,
 											   rpath,
-											   create_pathtarget(root, tlist),
+											   result_rel->reltarget,
 											   groupList,
 											   root->wt_param_id,
 											   dNumGroups);
 
-	return path;
+	add_path(result_rel, path);
+	postprocess_setop_rel(root, result_rel);
+	return result_rel;
 }
 
 /*
- * Generate path for a UNION or UNION ALL node
+ * Generate paths for a UNION or UNION ALL node
  */
-static Path *
-generate_union_path(SetOperationStmt *op, PlannerInfo *root,
-					List *refnames_tlist,
-					List **pTargetList,
-					double *pNumGroups)
+static RelOptInfo *
+generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
+					 List *refnames_tlist,
+					 List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	Relids		relids = NULL;
+	RelOptInfo *result_rel;
 	double		save_fraction = root->tuple_fraction;
-	List	   *pathlist;
+	ListCell   *lc;
+	List	   *pathlist = NIL;
+	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
 	Path	   *path;
@@ -569,7 +577,7 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	pathlist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+	rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -583,13 +591,24 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build path list and relid set. */
+	foreach(lc, rellist)
+	{
+		RelOptInfo *rel = lfirst(lc);
+
+		pathlist = lappend(pathlist, rel->cheapest_total_path);
+		relids = bms_union(relids, rel->relids);
+	}
+
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * Append the child results together.
 	 */
 	path = (Path *) create_append_path(result_rel, pathlist, NIL,
 									   NULL, 0, false, NIL, -1);
-	/* We have to manually jam the right tlist into the path; ick */
-	path->pathtarget = create_pathtarget(root, tlist);
 
 	/*
 	 * For UNION ALL, we just need the Append path.  For UNION, need to add
@@ -598,30 +617,32 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	if (!op->all)
 		path = make_union_unique(op, path, tlist, root);
 
+	add_path(result_rel, path);
+
 	/*
-	 * Estimate number of groups if caller wants it.  For now we just assume
-	 * the output is unique --- this is certainly true for the UNION case, and
-	 * we want worst-case estimates anyway.
+	 * Estimate number of groups.  For now we just assume the output is unique
+	 * --- this is certainly true for the UNION case, and we want worst-case
+	 * estimates anyway.
 	 */
-	if (pNumGroups)
-		*pNumGroups = path->rows;
+	result_rel->rows = path->rows;
 
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
-	return path;
+	return result_rel;
 }
 
 /*
- * Generate path for an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL node
+ * Generate paths for an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL node
  */
-static Path *
-generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
-					   List *refnames_tlist,
-					   List **pTargetList,
-					   double *pNumGroups)
+static RelOptInfo *
+generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
+						List *refnames_tlist,
+						List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	RelOptInfo *result_rel;
+	RelOptInfo *lrel,
+			   *rrel;
 	double		save_fraction = root->tuple_fraction;
 	Path	   *lpath,
 			   *rpath,
@@ -646,18 +667,20 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 	root->tuple_fraction = 0.0;
 
 	/* Recurse on children, ensuring their outputs are marked */
-	lpath = recurse_set_operations(op->larg, root,
-								   op->colTypes, op->colCollations,
-								   false, 0,
-								   refnames_tlist,
-								   &lpath_tlist,
-								   &dLeftGroups);
-	rpath = recurse_set_operations(op->rarg, root,
-								   op->colTypes, op->colCollations,
-								   false, 1,
-								   refnames_tlist,
-								   &rpath_tlist,
-								   &dRightGroups);
+	lrel = recurse_set_operations(op->larg, root,
+								  op->colTypes, op->colCollations,
+								  false, 0,
+								  refnames_tlist,
+								  &lpath_tlist,
+								  &dLeftGroups);
+	lpath = lrel->cheapest_total_path;
+	rrel = recurse_set_operations(op->rarg, root,
+								  op->colTypes, op->colCollations,
+								  false, 1,
+								  refnames_tlist,
+								  &rpath_tlist,
+								  &dRightGroups);
+	rpath = rrel->cheapest_total_path;
 
 	/* Undo effects of forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
@@ -695,15 +718,17 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP,
+								 bms_union(lrel->relids, rrel->relids));
+	result_rel->reltarget = create_pathtarget(root, tlist);;
+
 	/*
 	 * Append the child results together.
 	 */
 	path = (Path *) create_append_path(result_rel, pathlist, NIL,
 									   NULL, 0, false, NIL, -1);
 
-	/* We have to manually jam the right tlist into the path; ick */
-	path->pathtarget = create_pathtarget(root, tlist);
-
 	/* Identify the grouping semantics */
 	groupList = generate_setop_grouplist(op, tlist);
 
@@ -769,10 +794,9 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 									  dNumGroups,
 									  dNumOutputRows);
 
-	if (pNumGroups)
-		*pNumGroups = dNumGroups;
-
-	return path;
+	result_rel->rows = path->rows;
+	add_path(result_rel, path);
+	return result_rel;
 }
 
 /*
@@ -897,8 +921,6 @@ make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 															   groupList,
 															   tlist),
 								 -1.0);
-		/* We have to manually jam the right tlist into the path; ick */
-		path->pathtarget = create_pathtarget(root, tlist);
 		path = (Path *) create_upper_unique_path(root,
 												 result_rel,
 												 path,
@@ -909,6 +931,24 @@ make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 	return path;
 }
 
+/*
+ * postprocess_setop_rel - perform steps required after adding paths
+ */
+static void
+postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	/*
+	 * We don't currently worry about allowing FDWs to contribute paths to
+	 * this relation, but give extensions a chance.
+	 */
+	if (create_upper_paths_hook)
+		(*create_upper_paths_hook) (root, UPPERREL_SETOP,
+									NULL, rel);
+
+	/* Select cheapest path */
+	set_cheapest(rel);
+}
+
 /*
  * choose_hashed_setop - should we use hashing for a set operation?
  */
-- 
2.14.3 (Apple Git-98)

0002-Rewrite-recurse_union_children-to-iterate-rather-tha.patchapplication/octet-stream; name=0002-Rewrite-recurse_union_children-to-iterate-rather-tha.patchDownload
From 3604a7d0b00f65dc670e42fa0d7fc36633ef15ea Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 12:13:46 -0500
Subject: [PATCH 2/4] Rewrite recurse_union_children to iterate, rather than
 recurse.

Also, rename it to plan_union_chidren, so the old name wasn't
very descriptive.  This results in a small net reduction in code,
seems at least to me to be easier to understand, and saves
space on the process stack.

Patch by me, reviewed by Ashutosh Bapat.
---
 src/backend/optimizer/prep/prepunion.c | 100 ++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 53 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index b586f941a8..f387387289 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -78,10 +78,10 @@ static Path *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 					   List *refnames_tlist,
 					   List **pTargetList,
 					   double *pNumGroups);
-static List *recurse_union_children(Node *setOp, PlannerInfo *root,
-					   SetOperationStmt *top_union,
-					   List *refnames_tlist,
-					   List **tlist_list);
+static List *plan_union_children(PlannerInfo *root,
+					SetOperationStmt *top_union,
+					List *refnames_tlist,
+					List **tlist_list);
 static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 				  PlannerInfo *root);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
@@ -545,8 +545,6 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
 	double		save_fraction = root->tuple_fraction;
 	List	   *pathlist;
-	List	   *child_tlists1;
-	List	   *child_tlists2;
 	List	   *tlist_list;
 	List	   *tlist;
 	Path	   *path;
@@ -571,13 +569,7 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	pathlist = list_concat(recurse_union_children(op->larg, root,
-												  op, refnames_tlist,
-												  &child_tlists1),
-						   recurse_union_children(op->rarg, root,
-												  op, refnames_tlist,
-												  &child_tlists2));
-	tlist_list = list_concat(child_tlists1, child_tlists2);
+	pathlist = plan_union_children(root, op, refnames_tlist, &tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -797,56 +789,58 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
  * generate_union_path will force a fresh sort if the top level is a UNION.
  */
 static List *
-recurse_union_children(Node *setOp, PlannerInfo *root,
-					   SetOperationStmt *top_union,
-					   List *refnames_tlist,
-					   List **tlist_list)
+plan_union_children(PlannerInfo *root,
+					SetOperationStmt *top_union,
+					List *refnames_tlist,
+					List **tlist_list)
 {
-	List	   *result;
+	List	   *pending_rels = list_make1(top_union);
+	List	   *result = NIL;
 	List	   *child_tlist;
 
-	if (IsA(setOp, SetOperationStmt))
+	*tlist_list = NIL;
+
+	while (pending_rels != NIL)
 	{
-		SetOperationStmt *op = (SetOperationStmt *) setOp;
+		Node	   *setOp = linitial(pending_rels);
+
+		pending_rels = list_delete_first(pending_rels);
 
-		if (op->op == top_union->op &&
-			(op->all == top_union->all || op->all) &&
-			equal(op->colTypes, top_union->colTypes))
+		if (IsA(setOp, SetOperationStmt))
 		{
-			/* Same UNION, so fold children into parent's subpath list */
-			List	   *child_tlists1;
-			List	   *child_tlists2;
+			SetOperationStmt *op = (SetOperationStmt *) setOp;
 
-			result = list_concat(recurse_union_children(op->larg, root,
-														top_union,
-														refnames_tlist,
-														&child_tlists1),
-								 recurse_union_children(op->rarg, root,
-														top_union,
-														refnames_tlist,
-														&child_tlists2));
-			*tlist_list = list_concat(child_tlists1, child_tlists2);
-			return result;
+			if (op->op == top_union->op &&
+				(op->all == top_union->all || op->all) &&
+				equal(op->colTypes, top_union->colTypes))
+			{
+				/* Same UNION, so fold children into parent */
+				pending_rels = lcons(op->rarg, pending_rels);
+				pending_rels = lcons(op->larg, pending_rels);
+				continue;
+			}
 		}
+
+		/*
+		 * Not same, so plan this child separately.
+		 *
+		 * Note we disallow any resjunk columns in child results.  This is
+		 * necessary since the Append node that implements the union won't do
+		 * any projection, and upper levels will get confused if some of our
+		 * output tuples have junk and some don't.  This case only arises when
+		 * we have an EXCEPT or INTERSECT as child, else there won't be
+		 * resjunk anyway.
+		 */
+		result = lappend(result, recurse_set_operations(setOp, root,
+														top_union->colTypes,
+														top_union->colCollations,
+														false, -1,
+														refnames_tlist,
+														&child_tlist,
+														NULL));
+		*tlist_list = lappend(*tlist_list, child_tlist);
 	}
 
-	/*
-	 * Not same, so plan this child separately.
-	 *
-	 * Note we disallow any resjunk columns in child results.  This is
-	 * necessary since the Append node that implements the union won't do any
-	 * projection, and upper levels will get confused if some of our output
-	 * tuples have junk and some don't.  This case only arises when we have an
-	 * EXCEPT or INTERSECT as child, else there won't be resjunk anyway.
-	 */
-	result = list_make1(recurse_set_operations(setOp, root,
-											   top_union->colTypes,
-											   top_union->colCollations,
-											   false, -1,
-											   refnames_tlist,
-											   &child_tlist,
-											   NULL));
-	*tlist_list = list_make1(child_tlist);
 	return result;
 }
 
-- 
2.14.3 (Apple Git-98)

0001-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patchapplication/octet-stream; name=0001-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patchDownload
From 2c8672d1ac8028894afbf1777072c35305b249e2 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 11:53:07 -0500
Subject: [PATCH 1/4] Let Parallel Append over simple UNION ALL have partial
 subpaths.

A simple UNION ALL gets flattened into an appendrel of subquery
RTEs, but up until now it's been impossible for the appendrel to use
the partial paths for the subqueries, so we can implement the
appendrel as a Parallel Append but only one with non-partial paths
as children.

There are three separate obstacles to removing that limitation.
First, when planning a subquery, propagate any partial paths to the
final_rel so that they are potentially visible to outer query levels
(but not if they have initPlans attached, because that wouldn't be
safe).  Second, after planning a subquery, propagate any partial paths
for the final_rel to the subquery RTE in the outer query level in the
same way we do for non-partial paths.  Third, teach finalize_plan() to
account for the possibility that the fake parameter we use for rescan
signalling when the plan contains a Gather (Merge) node may be
propagated from an outer query level.

Patch by me, reviewed and tested by Amit Khandekar and Rajkumar
Raghuwanshi.  Test cases based on examples by Rajkumar Raghuwanshi.

Discussion: http://postgr.es/m/CA+Tgmoa6L9A1nNCk3aTDVZLZ4KkHDn1+tm7mFyFvP+uQPS7bAg@mail.gmail.com
---
 src/backend/optimizer/path/allpaths.c         | 22 +++++++++
 src/backend/optimizer/plan/planner.c          | 16 +++++++
 src/backend/optimizer/plan/subselect.c        | 17 ++++++-
 src/test/regress/expected/select_parallel.out | 65 +++++++++++++++++++++++++++
 src/test/regress/sql/select_parallel.sql      | 25 +++++++++++
 5 files changed, 143 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 1c792a00eb..ea4e683abb 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2179,6 +2179,28 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 				 create_subqueryscan_path(root, rel, subpath,
 										  pathkeys, required_outer));
 	}
+
+	/* If consider_parallel is false, there should be no partial paths. */
+	Assert(sub_final_rel->consider_parallel ||
+		   sub_final_rel->partial_pathlist == NIL);
+
+	/* Same for partial paths. */
+	foreach(lc, sub_final_rel->partial_pathlist)
+	{
+		Path	   *subpath = (Path *) lfirst(lc);
+		List	   *pathkeys;
+
+		/* Convert subpath's pathkeys to outer representation */
+		pathkeys = convert_subquery_pathkeys(root,
+											 rel,
+											 subpath->pathkeys,
+											 make_tlist_from_pathtarget(subpath->pathtarget));
+
+		/* Generate outer path using this subpath */
+		add_partial_path(rel, (Path *)
+						 create_subqueryscan_path(root, rel, subpath,
+												  pathkeys, required_outer));
+	}
 }
 
 /*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 14b7becf3e..91b32d4947 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -2194,6 +2194,22 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		add_path(final_rel, path);
 	}
 
+	/*
+	 * Generate partial paths for final_rel, too, if outer query levels might
+	 * be able to make use of them.
+	 */
+	if (final_rel->consider_parallel && root->query_level > 1 &&
+		!limit_needed(parse))
+	{
+		Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
+		foreach(lc, current_rel->partial_pathlist)
+		{
+			Path	   *partial_path = (Path *) lfirst(lc);
+
+			add_partial_path(final_rel, partial_path);
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding ForeignPaths.
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 46367cba63..32bf1fd955 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2202,6 +2202,13 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)
 		path->parallel_safe = false;
 	}
 
+	/*
+	 * Forget about any partial paths and clear consider_parallel, too;
+	 * they're not usable if we attached an initPlan.
+	 */
+	final_rel->partial_pathlist = NIL;
+	final_rel->consider_parallel = false;
+
 	/* We needn't do set_cheapest() here, caller will do it */
 }
 
@@ -2407,10 +2414,16 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 			{
 				SubqueryScan *sscan = (SubqueryScan *) plan;
 				RelOptInfo *rel;
+				Bitmapset  *subquery_params;
 
-				/* We must run SS_finalize_plan on the subquery */
+				/* We must run finalize_plan on the subquery */
 				rel = find_base_rel(root, sscan->scan.scanrelid);
-				SS_finalize_plan(rel->subroot, sscan->subplan);
+				subquery_params = rel->subroot->outer_params;
+				if (gather_param >= 0)
+					subquery_params = bms_add_member(bms_copy(subquery_params),
+													 gather_param);
+				finalize_plan(rel->subroot, sscan->subplan, gather_param,
+							  subquery_params, NULL);
 
 				/* Now we can add its extParams to the parent's params */
 				context.paramids = bms_add_members(context.paramids,
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 0a78261638..2fb16d1a15 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -890,4 +890,69 @@ select stringu1::int2 from tenk1 where unique1 = 1;
 ERROR:  invalid input syntax for integer: "BAAAAA"
 CONTEXT:  parallel worker
 ROLLBACK TO SAVEPOINT settings;
+-- test interaction with set-returning functions
+SAVEPOINT settings;
+-- multiple subqueries under a single Gather node
+-- must set parallel_setup_cost > 0 to discourage multiple Gather nodes
+SET LOCAL parallel_setup_cost = 10;
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1;
+                     QUERY PLAN                     
+----------------------------------------------------
+ Gather
+   Workers Planned: 4
+   ->  Parallel Append
+         ->  Parallel Seq Scan on tenk1
+               Filter: (fivethous = (tenthous + 1))
+         ->  Parallel Seq Scan on tenk1 tenk1_1
+               Filter: (fivethous = (tenthous + 1))
+(7 rows)
+
+ROLLBACK TO SAVEPOINT settings;
+-- can't use multiple subqueries under a single Gather node due to initPlans
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique1 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique2 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+ORDER BY 1;
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Sort
+   Sort Key: tenk1.unique1
+   ->  Append
+         ->  Gather
+               Workers Planned: 4
+               Params Evaluated: $1
+               InitPlan 1 (returns $1)
+                 ->  Limit
+                       ->  Gather
+                             Workers Planned: 4
+                             ->  Parallel Seq Scan on tenk1 tenk1_2
+                                   Filter: (fivethous = 1)
+               ->  Parallel Seq Scan on tenk1
+                     Filter: (fivethous = $1)
+         ->  Gather
+               Workers Planned: 4
+               Params Evaluated: $3
+               InitPlan 2 (returns $3)
+                 ->  Limit
+                       ->  Gather
+                             Workers Planned: 4
+                             ->  Parallel Seq Scan on tenk1 tenk1_3
+                                   Filter: (fivethous = 1)
+               ->  Parallel Seq Scan on tenk1 tenk1_1
+                     Filter: (fivethous = $3)
+(25 rows)
+
+-- test interaction with SRFs
+SELECT * FROM information_schema.foreign_data_wrapper_options
+ORDER BY 1, 2, 3;
+ foreign_data_wrapper_catalog | foreign_data_wrapper_name | option_name | option_value 
+------------------------------+---------------------------+-------------+--------------
+(0 rows)
+
 rollback;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index fa03aae0c0..ec817f2a4c 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -358,4 +358,29 @@ SET LOCAL force_parallel_mode = 1;
 select stringu1::int2 from tenk1 where unique1 = 1;
 ROLLBACK TO SAVEPOINT settings;
 
+-- test interaction with set-returning functions
+SAVEPOINT settings;
+
+-- multiple subqueries under a single Gather node
+-- must set parallel_setup_cost > 0 to discourage multiple Gather nodes
+SET LOCAL parallel_setup_cost = 10;
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1;
+ROLLBACK TO SAVEPOINT settings;
+
+-- can't use multiple subqueries under a single Gather node due to initPlans
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique1 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique2 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+ORDER BY 1;
+
+-- test interaction with SRFs
+SELECT * FROM information_schema.foreign_data_wrapper_options
+ORDER BY 1, 2, 3;
+
 rollback;
-- 
2.14.3 (Apple Git-98)

#13Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Robert Haas (#12)
4 attachment(s)
Re: parallel append vs. simple UNION ALL

On Fri, Mar 9, 2018 at 1:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:

0003
Probably we want to rename generate_union_path() as generate_union_rel() or
generate_union_paths() since the function doesn't return a path anymore.
Similarly for generate_nonunion_path().

Good point. Changed.

It looks like it was not changed in all the places. make failed. I
have fixed all the instances of these two functions in the attached
patchset (only 0003 changes). Please check.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

0001-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patch (text/x-patch; charset=US-ASCII)
From 7e654c0ae30d867edea5a1a2ca8f7a17b05ed7c5 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 11:53:07 -0500
Subject: [PATCH 1/4] Let Parallel Append over simple UNION ALL have partial
 subpaths.

A simple UNION ALL gets flattened into an appendrel of subquery
RTEs, but up until now it's been impossible for the appendrel to use
the partial paths for the subqueries, so we can implement the
appendrel as a Parallel Append but only one with non-partial paths
as children.

There are three separate obstacles to removing that limitation.
First, when planning a subquery, propagate any partial paths to the
final_rel so that they are potentially visible to outer query levels
(but not if they have initPlans attached, because that wouldn't be
safe).  Second, after planning a subquery, propagate any partial paths
for the final_rel to the subquery RTE in the outer query level in the
same way we do for non-partial paths.  Third, teach finalize_plan() to
account for the possibility that the fake parameter we use for rescan
signalling when the plan contains a Gather (Merge) node may be
propagated from an outer query level.

Patch by me, reviewed and tested by Amit Khandekar and Rajkumar
Raghuwanshi.  Test cases based on examples by Rajkumar Raghuwanshi.

Discussion: http://postgr.es/m/CA+Tgmoa6L9A1nNCk3aTDVZLZ4KkHDn1+tm7mFyFvP+uQPS7bAg@mail.gmail.com
---
 src/backend/optimizer/path/allpaths.c         |   22 +++++++++
 src/backend/optimizer/plan/planner.c          |   16 ++++++
 src/backend/optimizer/plan/subselect.c        |   17 ++++++-
 src/test/regress/expected/select_parallel.out |   65 +++++++++++++++++++++++++
 src/test/regress/sql/select_parallel.sql      |   25 ++++++++++
 5 files changed, 143 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 1c792a0..ea4e683 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2179,6 +2179,28 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
 				 create_subqueryscan_path(root, rel, subpath,
 										  pathkeys, required_outer));
 	}
+
+	/* If consider_parallel is false, there should be no partial paths. */
+	Assert(sub_final_rel->consider_parallel ||
+		   sub_final_rel->partial_pathlist == NIL);
+
+	/* Same for partial paths. */
+	foreach(lc, sub_final_rel->partial_pathlist)
+	{
+		Path	   *subpath = (Path *) lfirst(lc);
+		List	   *pathkeys;
+
+		/* Convert subpath's pathkeys to outer representation */
+		pathkeys = convert_subquery_pathkeys(root,
+											 rel,
+											 subpath->pathkeys,
+											 make_tlist_from_pathtarget(subpath->pathtarget));
+
+		/* Generate outer path using this subpath */
+		add_partial_path(rel, (Path *)
+						 create_subqueryscan_path(root, rel, subpath,
+												  pathkeys, required_outer));
+	}
 }
 
 /*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 24e6c46..66e7e7b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -2195,6 +2195,22 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 	}
 
 	/*
+	 * Generate partial paths for final_rel, too, if outer query levels might
+	 * be able to make use of them.
+	 */
+	if (final_rel->consider_parallel && root->query_level > 1 &&
+		!limit_needed(parse))
+	{
+		Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
+		foreach(lc, current_rel->partial_pathlist)
+		{
+			Path	   *partial_path = (Path *) lfirst(lc);
+
+			add_partial_path(final_rel, partial_path);
+		}
+	}
+
+	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding ForeignPaths.
 	 */
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index dc86dd5..83008d7 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2202,6 +2202,13 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)
 		path->parallel_safe = false;
 	}
 
+	/*
+	 * Forget about any partial paths and clear consider_parallel, too;
+	 * they're not usable if we attached an initPlan.
+	 */
+	final_rel->partial_pathlist = NIL;
+	final_rel->consider_parallel = false;
+
 	/* We needn't do set_cheapest() here, caller will do it */
 }
 
@@ -2407,10 +2414,16 @@ finalize_plan(PlannerInfo *root, Plan *plan,
 			{
 				SubqueryScan *sscan = (SubqueryScan *) plan;
 				RelOptInfo *rel;
+				Bitmapset  *subquery_params;
 
-				/* We must run SS_finalize_plan on the subquery */
+				/* We must run finalize_plan on the subquery */
 				rel = find_base_rel(root, sscan->scan.scanrelid);
-				SS_finalize_plan(rel->subroot, sscan->subplan);
+				subquery_params = rel->subroot->outer_params;
+				if (gather_param >= 0)
+					subquery_params = bms_add_member(bms_copy(subquery_params),
+													 gather_param);
+				finalize_plan(rel->subroot, sscan->subplan, gather_param,
+							  subquery_params, NULL);
 
 				/* Now we can add its extParams to the parent's params */
 				context.paramids = bms_add_members(context.paramids,
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 0a78261..2fb16d1 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -890,4 +890,69 @@ select stringu1::int2 from tenk1 where unique1 = 1;
 ERROR:  invalid input syntax for integer: "BAAAAA"
 CONTEXT:  parallel worker
 ROLLBACK TO SAVEPOINT settings;
+-- test interaction with set-returning functions
+SAVEPOINT settings;
+-- multiple subqueries under a single Gather node
+-- must set parallel_setup_cost > 0 to discourage multiple Gather nodes
+SET LOCAL parallel_setup_cost = 10;
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1;
+                     QUERY PLAN                     
+----------------------------------------------------
+ Gather
+   Workers Planned: 4
+   ->  Parallel Append
+         ->  Parallel Seq Scan on tenk1
+               Filter: (fivethous = (tenthous + 1))
+         ->  Parallel Seq Scan on tenk1 tenk1_1
+               Filter: (fivethous = (tenthous + 1))
+(7 rows)
+
+ROLLBACK TO SAVEPOINT settings;
+-- can't use multiple subqueries under a single Gather node due to initPlans
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique1 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique2 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+ORDER BY 1;
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Sort
+   Sort Key: tenk1.unique1
+   ->  Append
+         ->  Gather
+               Workers Planned: 4
+               Params Evaluated: $1
+               InitPlan 1 (returns $1)
+                 ->  Limit
+                       ->  Gather
+                             Workers Planned: 4
+                             ->  Parallel Seq Scan on tenk1 tenk1_2
+                                   Filter: (fivethous = 1)
+               ->  Parallel Seq Scan on tenk1
+                     Filter: (fivethous = $1)
+         ->  Gather
+               Workers Planned: 4
+               Params Evaluated: $3
+               InitPlan 2 (returns $3)
+                 ->  Limit
+                       ->  Gather
+                             Workers Planned: 4
+                             ->  Parallel Seq Scan on tenk1 tenk1_3
+                                   Filter: (fivethous = 1)
+               ->  Parallel Seq Scan on tenk1 tenk1_1
+                     Filter: (fivethous = $3)
+(25 rows)
+
+-- test interaction with SRFs
+SELECT * FROM information_schema.foreign_data_wrapper_options
+ORDER BY 1, 2, 3;
+ foreign_data_wrapper_catalog | foreign_data_wrapper_name | option_name | option_value 
+------------------------------+---------------------------+-------------+--------------
+(0 rows)
+
 rollback;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index fa03aae..ec817f2 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -358,4 +358,29 @@ SET LOCAL force_parallel_mode = 1;
 select stringu1::int2 from tenk1 where unique1 = 1;
 ROLLBACK TO SAVEPOINT settings;
 
+-- test interaction with set-returning functions
+SAVEPOINT settings;
+
+-- multiple subqueries under a single Gather node
+-- must set parallel_setup_cost > 0 to discourage multiple Gather nodes
+SET LOCAL parallel_setup_cost = 10;
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous = tenthous + 1;
+ROLLBACK TO SAVEPOINT settings;
+
+-- can't use multiple subqueries under a single Gather node due to initPlans
+EXPLAIN (COSTS OFF)
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique1 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+UNION ALL
+SELECT unique1 FROM tenk1 WHERE fivethous =
+	(SELECT unique2 FROM tenk1 WHERE fivethous = 1 LIMIT 1)
+ORDER BY 1;
+
+-- test interaction with SRFs
+SELECT * FROM information_schema.foreign_data_wrapper_options
+ORDER BY 1, 2, 3;
+
 rollback;
-- 
1.7.9.5

0002-Rewrite-recurse_union_children-to-iterate-rather-tha.patch (text/x-patch; charset=US-ASCII)
From bac26e67c8732f805a1f5682773a8c5ef44f8a8c Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 12:13:46 -0500
Subject: [PATCH 2/4] Rewrite recurse_union_children to iterate, rather than
 recurse.

Also, rename it to plan_union_children, since the old name wasn't
very descriptive.  This results in a small net reduction in code,
seems at least to me to be easier to understand, and saves
space on the process stack.

Patch by me, reviewed by Ashutosh Bapat.
---
 src/backend/optimizer/prep/prepunion.c |  100 +++++++++++++++-----------------
 1 file changed, 47 insertions(+), 53 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index b586f94..f387387 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -78,10 +78,10 @@ static Path *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 					   List *refnames_tlist,
 					   List **pTargetList,
 					   double *pNumGroups);
-static List *recurse_union_children(Node *setOp, PlannerInfo *root,
-					   SetOperationStmt *top_union,
-					   List *refnames_tlist,
-					   List **tlist_list);
+static List *plan_union_children(PlannerInfo *root,
+					SetOperationStmt *top_union,
+					List *refnames_tlist,
+					List **tlist_list);
 static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 				  PlannerInfo *root);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
@@ -545,8 +545,6 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
 	double		save_fraction = root->tuple_fraction;
 	List	   *pathlist;
-	List	   *child_tlists1;
-	List	   *child_tlists2;
 	List	   *tlist_list;
 	List	   *tlist;
 	Path	   *path;
@@ -571,13 +569,7 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	pathlist = list_concat(recurse_union_children(op->larg, root,
-												  op, refnames_tlist,
-												  &child_tlists1),
-						   recurse_union_children(op->rarg, root,
-												  op, refnames_tlist,
-												  &child_tlists2));
-	tlist_list = list_concat(child_tlists1, child_tlists2);
+	pathlist = plan_union_children(root, op, refnames_tlist, &tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -797,56 +789,58 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
  * generate_union_path will force a fresh sort if the top level is a UNION.
  */
 static List *
-recurse_union_children(Node *setOp, PlannerInfo *root,
-					   SetOperationStmt *top_union,
-					   List *refnames_tlist,
-					   List **tlist_list)
+plan_union_children(PlannerInfo *root,
+					SetOperationStmt *top_union,
+					List *refnames_tlist,
+					List **tlist_list)
 {
-	List	   *result;
+	List	   *pending_rels = list_make1(top_union);
+	List	   *result = NIL;
 	List	   *child_tlist;
 
-	if (IsA(setOp, SetOperationStmt))
+	*tlist_list = NIL;
+
+	while (pending_rels != NIL)
 	{
-		SetOperationStmt *op = (SetOperationStmt *) setOp;
+		Node	   *setOp = linitial(pending_rels);
+
+		pending_rels = list_delete_first(pending_rels);
 
-		if (op->op == top_union->op &&
-			(op->all == top_union->all || op->all) &&
-			equal(op->colTypes, top_union->colTypes))
+		if (IsA(setOp, SetOperationStmt))
 		{
-			/* Same UNION, so fold children into parent's subpath list */
-			List	   *child_tlists1;
-			List	   *child_tlists2;
+			SetOperationStmt *op = (SetOperationStmt *) setOp;
 
-			result = list_concat(recurse_union_children(op->larg, root,
-														top_union,
-														refnames_tlist,
-														&child_tlists1),
-								 recurse_union_children(op->rarg, root,
-														top_union,
-														refnames_tlist,
-														&child_tlists2));
-			*tlist_list = list_concat(child_tlists1, child_tlists2);
-			return result;
+			if (op->op == top_union->op &&
+				(op->all == top_union->all || op->all) &&
+				equal(op->colTypes, top_union->colTypes))
+			{
+				/* Same UNION, so fold children into parent */
+				pending_rels = lcons(op->rarg, pending_rels);
+				pending_rels = lcons(op->larg, pending_rels);
+				continue;
+			}
 		}
+
+		/*
+		 * Not same, so plan this child separately.
+		 *
+		 * Note we disallow any resjunk columns in child results.  This is
+		 * necessary since the Append node that implements the union won't do
+		 * any projection, and upper levels will get confused if some of our
+		 * output tuples have junk and some don't.  This case only arises when
+		 * we have an EXCEPT or INTERSECT as child, else there won't be
+		 * resjunk anyway.
+		 */
+		result = lappend(result, recurse_set_operations(setOp, root,
+														top_union->colTypes,
+														top_union->colCollations,
+														false, -1,
+														refnames_tlist,
+														&child_tlist,
+														NULL));
+		*tlist_list = lappend(*tlist_list, child_tlist);
 	}
 
-	/*
-	 * Not same, so plan this child separately.
-	 *
-	 * Note we disallow any resjunk columns in child results.  This is
-	 * necessary since the Append node that implements the union won't do any
-	 * projection, and upper levels will get confused if some of our output
-	 * tuples have junk and some don't.  This case only arises when we have an
-	 * EXCEPT or INTERSECT as child, else there won't be resjunk anyway.
-	 */
-	result = list_make1(recurse_set_operations(setOp, root,
-											   top_union->colTypes,
-											   top_union->colCollations,
-											   false, -1,
-											   refnames_tlist,
-											   &child_tlist,
-											   NULL));
-	*tlist_list = list_make1(child_tlist);
 	return result;
 }
 
-- 
1.7.9.5

0003-Generate-a-separate-upper-relation-for-each-stage-of.patch (text/x-patch; charset=US-ASCII)
From f3e3f4b2a8b58c559c2759a99c71187ff3190353 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Feb 2018 15:42:51 -0500
Subject: [PATCH 3/4] Generate a separate upper relation for each stage of
 setop planning.

Commit 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f made setop planning
stages return paths rather than plans, but all such paths were loosely
associated with a single RelOptInfo, and only the final path was added
to the RelOptInfo.  Even at the time, it was foreseen that this should
be changed, because there is otherwise no good way for a single stage
of setop planning to return multiple paths.  With this patch, each
stage of set operation planning now creates a separate RelOptInfo;
these are distinguished by using appropriate relid sets.  Note that
this patch does nothing whatsoever about actually returning multiple
paths for the same set operation; it just makes it possible for a
future patch to do so.

Along the way, adjust things so that create_upper_paths_hook is called
for each of these new RelOptInfos rather than just once, since that
might be useful to extensions using that hook.  It might be a good idea
to provide an FDW API here as well, but I didn't try to do that for
now.

Patch by me, reviewed by Ashutosh Bapat.
---
 src/backend/optimizer/prep/prepunion.c |  340 ++++++++++++++++++--------------
 1 file changed, 190 insertions(+), 150 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index f387387..f087369 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -60,30 +60,29 @@ typedef struct
 	AppendRelInfo **appinfos;
 } adjust_appendrel_attrs_context;
 
-static Path *recurse_set_operations(Node *setOp, PlannerInfo *root,
+static RelOptInfo *recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
 					   int flag, List *refnames_tlist,
 					   List **pTargetList,
 					   double *pNumGroups);
-static Path *generate_recursion_path(SetOperationStmt *setOp,
+static RelOptInfo *generate_recursion_path(SetOperationStmt *setOp,
 						PlannerInfo *root,
 						List *refnames_tlist,
 						List **pTargetList);
-static Path *generate_union_path(SetOperationStmt *op, PlannerInfo *root,
-					List *refnames_tlist,
-					List **pTargetList,
-					double *pNumGroups);
-static Path *generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
-					   List *refnames_tlist,
-					   List **pTargetList,
-					   double *pNumGroups);
+static RelOptInfo *generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
+					 List *refnames_tlist,
+					 List **pTargetList);
+static RelOptInfo *generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
+						List *refnames_tlist,
+						List **pTargetList);
 static List *plan_union_children(PlannerInfo *root,
 					SetOperationStmt *top_union,
 					List *refnames_tlist,
 					List **tlist_list);
 static Path *make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 				  PlannerInfo *root);
+static void postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel);
 static bool choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 					Path *input_path,
 					double dNumGroups, double dNumOutputRows,
@@ -149,7 +148,6 @@ plan_set_operations(PlannerInfo *root)
 	RangeTblEntry *leftmostRTE;
 	Query	   *leftmostQuery;
 	RelOptInfo *setop_rel;
-	Path	   *path;
 	List	   *top_tlist;
 
 	Assert(topop);
@@ -182,56 +180,33 @@ plan_set_operations(PlannerInfo *root)
 	Assert(leftmostQuery != NULL);
 
 	/*
-	 * We return our results in the (SETOP, NULL) upperrel.  For the moment,
-	 * this is also the parent rel of all Paths in the setop tree; we may well
-	 * change that in future.
-	 */
-	setop_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
-
-	/*
-	 * We don't currently worry about setting setop_rel's consider_parallel
-	 * flag, nor about allowing FDWs to contribute paths to it.
-	 */
-
-	/*
 	 * If the topmost node is a recursive union, it needs special processing.
 	 */
 	if (root->hasRecursion)
 	{
-		path = generate_recursion_path(topop, root,
-									   leftmostQuery->targetList,
-									   &top_tlist);
+		setop_rel = generate_recursion_path(topop, root,
+											leftmostQuery->targetList,
+											&top_tlist);
 	}
 	else
 	{
 		/*
 		 * Recurse on setOperations tree to generate paths for set ops. The
-		 * final output path should have just the column types shown as the
+		 * final output paths should have just the column types shown as the
 		 * output from the top-level node, plus possibly resjunk working
 		 * columns (we can rely on upper-level nodes to deal with that).
 		 */
-		path = recurse_set_operations((Node *) topop, root,
-									  topop->colTypes, topop->colCollations,
-									  true, -1,
-									  leftmostQuery->targetList,
-									  &top_tlist,
-									  NULL);
+		setop_rel = recurse_set_operations((Node *) topop, root,
+										   topop->colTypes, topop->colCollations,
+										   true, -1,
+										   leftmostQuery->targetList,
+										   &top_tlist,
+										   NULL);
 	}
 
 	/* Must return the built tlist into root->processed_tlist. */
 	root->processed_tlist = top_tlist;
 
-	/* Add only the final path to the SETOP upperrel. */
-	add_path(setop_rel, path);
-
-	/* Let extensions possibly add some more paths */
-	if (create_upper_paths_hook)
-		(*create_upper_paths_hook) (root, UPPERREL_SETOP,
-									NULL, setop_rel);
-
-	/* Select cheapest path */
-	set_cheapest(setop_rel);
-
 	return setop_rel;
 }
 
@@ -245,21 +220,21 @@ plan_set_operations(PlannerInfo *root)
  * flag: if >= 0, add a resjunk output column indicating value of flag
  * refnames_tlist: targetlist to take column names from
  *
- * Returns a path for the subtree, as well as these output parameters:
+ * Returns a RelOptInfo for the subtree, as well as these output parameters:
  * *pTargetList: receives the fully-fledged tlist for the subtree's top plan
  * *pNumGroups: if not NULL, we estimate the number of distinct groups
  *		in the result, and store it there
  *
  * The pTargetList output parameter is mostly redundant with the pathtarget
- * of the returned path, but for the moment we need it because much of the
- * logic in this file depends on flag columns being marked resjunk.  Pending
- * a redesign of how that works, this is the easy way out.
+ * of the returned RelOptInfo, but for the moment we need it because much of
+ * the logic in this file depends on flag columns being marked resjunk.
+ * Pending a redesign of how that works, this is the easy way out.
  *
  * We don't have to care about typmods here: the only allowed difference
  * between set-op input and output typmods is input is a specific typmod
  * and output is -1, and that does not require a coercion.
  */
-static Path *
+static RelOptInfo *
 recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List *colTypes, List *colCollations,
 					   bool junkOK,
@@ -267,6 +242,8 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 					   List **pTargetList,
 					   double *pNumGroups)
 {
+	RelOptInfo *rel = NULL;		/* keep compiler quiet */
+
 	/* Guard against stack overflow due to overly complex setop nests */
 	check_stack_depth();
 
@@ -275,7 +252,6 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		RangeTblRef *rtr = (RangeTblRef *) setOp;
 		RangeTblEntry *rte = root->simple_rte_array[rtr->rtindex];
 		Query	   *subquery = rte->subquery;
-		RelOptInfo *rel;
 		PlannerInfo *subroot;
 		RelOptInfo *final_rel;
 		Path	   *subpath;
@@ -284,11 +260,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 
 		Assert(subquery != NULL);
 
-		/*
-		 * We need to build a RelOptInfo for each leaf subquery.  This isn't
-		 * used for much here, but it carries the subroot data structures
-		 * forward to setrefs.c processing.
-		 */
+		/* Build a RelOptInfo for this leaf subquery. */
 		rel = build_simple_rel(root, rtr->rtindex, NULL);
 
 		/* plan_params should not be in use in current query level */
@@ -307,6 +279,18 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		if (root->plan_params)
 			elog(ERROR, "unexpected outer reference in set operation subquery");
 
+		/* Figure out the appropriate target list for this subquery. */
+		tlist = generate_setop_tlist(colTypes, colCollations,
+									 flag,
+									 rtr->rtindex,
+									 true,
+									 subroot->processed_tlist,
+									 refnames_tlist);
+		rel->reltarget = create_pathtarget(root, tlist);
+
+		/* Return the fully-fledged tlist to caller, too */
+		*pTargetList = tlist;
+
 		/*
 		 * Mark rel with estimated output rows, width, etc.  Note that we have
 		 * to do this before generating outer-query paths, else
@@ -334,22 +318,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		path = (Path *) create_subqueryscan_path(root, rel, subpath,
 												 NIL, NULL);
 
-		/*
-		 * Figure out the appropriate target list, and update the
-		 * SubqueryScanPath with the PathTarget form of that.
-		 */
-		tlist = generate_setop_tlist(colTypes, colCollations,
-									 flag,
-									 rtr->rtindex,
-									 true,
-									 subroot->processed_tlist,
-									 refnames_tlist);
-
-		path = apply_projection_to_path(root, rel, path,
-										create_pathtarget(root, tlist));
-
-		/* Return the fully-fledged tlist to caller, too */
-		*pTargetList = tlist;
+		add_path(rel, path);
 
 		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
@@ -378,25 +347,22 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 												  subpath->rows,
 												  NULL);
 		}
-
-		return (Path *) path;
 	}
 	else if (IsA(setOp, SetOperationStmt))
 	{
 		SetOperationStmt *op = (SetOperationStmt *) setOp;
-		Path	   *path;
 
 		/* UNIONs are much different from INTERSECT/EXCEPT */
 		if (op->op == SETOP_UNION)
-			path = generate_union_path(op, root,
+			rel = generate_union_paths(op, root,
 									   refnames_tlist,
-									   pTargetList,
-									   pNumGroups);
+									   pTargetList);
 		else
-			path = generate_nonunion_path(op, root,
+			rel = generate_nonunion_paths(op, root,
 										  refnames_tlist,
-										  pTargetList,
-										  pNumGroups);
+										  pTargetList);
+		if (pNumGroups)
+			*pNumGroups = rel->rows;
 
 		/*
 		 * If necessary, add a Result node to project the caller-requested
@@ -415,39 +381,70 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 			!tlist_same_datatypes(*pTargetList, colTypes, junkOK) ||
 			!tlist_same_collations(*pTargetList, colCollations, junkOK))
 		{
+			PathTarget *target;
+			ListCell   *lc;
+
 			*pTargetList = generate_setop_tlist(colTypes, colCollations,
 												flag,
 												0,
 												false,
 												*pTargetList,
 												refnames_tlist);
-			path = apply_projection_to_path(root,
-											path->parent,
-											path,
-											create_pathtarget(root,
-															  *pTargetList));
+			target = create_pathtarget(root, *pTargetList);
+
+			/* Apply projection to each path */
+			foreach(lc, rel->pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				Path	   *path;
+
+				Assert(subpath->param_info == NULL);
+				path = apply_projection_to_path(root, subpath->parent,
+												subpath, target);
+				/* If we had to add a Result, path is different from subpath */
+				if (path != subpath)
+					lfirst(lc) = path;
+			}
+
+			/* Apply projection to each partial path */
+			foreach(lc, rel->partial_pathlist)
+			{
+				Path	   *subpath = (Path *) lfirst(lc);
+				Path	   *path;
+
+				Assert(subpath->param_info == NULL);
+
+				/* avoid apply_projection_to_path, in case of multiple refs */
+				path = (Path *) create_projection_path(root, subpath->parent,
+													   subpath, target);
+				lfirst(lc) = path;
+			}
 		}
-		return path;
 	}
 	else
 	{
 		elog(ERROR, "unrecognized node type: %d",
 			 (int) nodeTag(setOp));
 		*pTargetList = NIL;
-		return NULL;			/* keep compiler quiet */
 	}
+
+	postprocess_setop_rel(root, rel);
+
+	return rel;
 }
 
 /*
- * Generate path for a recursive UNION node
+ * Generate paths for a recursive UNION node
  */
-static Path *
+static RelOptInfo *
 generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 						List *refnames_tlist,
 						List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	RelOptInfo *result_rel;
 	Path	   *path;
+	RelOptInfo *lrel,
+			   *rrel;
 	Path	   *lpath;
 	Path	   *rpath;
 	List	   *lpath_tlist;
@@ -466,20 +463,22 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 	 * Unlike a regular UNION node, process the left and right inputs
 	 * separately without any intention of combining them into one Append.
 	 */
-	lpath = recurse_set_operations(setOp->larg, root,
-								   setOp->colTypes, setOp->colCollations,
-								   false, -1,
-								   refnames_tlist,
-								   &lpath_tlist,
-								   NULL);
+	lrel = recurse_set_operations(setOp->larg, root,
+								  setOp->colTypes, setOp->colCollations,
+								  false, -1,
+								  refnames_tlist,
+								  &lpath_tlist,
+								  NULL);
+	lpath = lrel->cheapest_total_path;
 	/* The right path will want to look at the left one ... */
 	root->non_recursive_path = lpath;
-	rpath = recurse_set_operations(setOp->rarg, root,
-								   setOp->colTypes, setOp->colCollations,
-								   false, -1,
-								   refnames_tlist,
-								   &rpath_tlist,
-								   NULL);
+	rrel = recurse_set_operations(setOp->rarg, root,
+								  setOp->colTypes, setOp->colCollations,
+								  false, -1,
+								  refnames_tlist,
+								  &rpath_tlist,
+								  NULL);
+	rpath = rrel->cheapest_total_path;
 	root->non_recursive_path = NULL;
 
 	/*
@@ -491,6 +490,11 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP,
+								 bms_union(lrel->relids, rrel->relids));
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * If UNION, identify the grouping operators
 	 */
@@ -525,26 +529,30 @@ generate_recursion_path(SetOperationStmt *setOp, PlannerInfo *root,
 											   result_rel,
 											   lpath,
 											   rpath,
-											   create_pathtarget(root, tlist),
+											   result_rel->reltarget,
 											   groupList,
 											   root->wt_param_id,
 											   dNumGroups);
 
-	return path;
+	add_path(result_rel, path);
+	postprocess_setop_rel(root, result_rel);
+	return result_rel;
 }
 
 /*
- * Generate path for a UNION or UNION ALL node
+ * Generate paths for a UNION or UNION ALL node
  */
-static Path *
-generate_union_path(SetOperationStmt *op, PlannerInfo *root,
-					List *refnames_tlist,
-					List **pTargetList,
-					double *pNumGroups)
+static RelOptInfo *
+generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
+					 List *refnames_tlist,
+					 List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	Relids		relids = NULL;
+	RelOptInfo *result_rel;
 	double		save_fraction = root->tuple_fraction;
-	List	   *pathlist;
+	ListCell   *lc;
+	List	   *pathlist = NIL;
+	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
 	Path	   *path;
@@ -569,7 +577,7 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	 * only one Append and unique-ification for the lot.  Recurse to find such
 	 * nodes and compute their children's paths.
 	 */
-	pathlist = plan_union_children(root, op, refnames_tlist, &tlist_list);
+	rellist = plan_union_children(root, op, refnames_tlist, &tlist_list);
 
 	/*
 	 * Generate tlist for Append plan node.
@@ -583,13 +591,24 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build path list and relid set. */
+	foreach(lc, rellist)
+	{
+		RelOptInfo *rel = lfirst(lc);
+
+		pathlist = lappend(pathlist, rel->cheapest_total_path);
+		relids = bms_union(relids, rel->relids);
+	}
+
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
+	result_rel->reltarget = create_pathtarget(root, tlist);
+
 	/*
 	 * Append the child results together.
 	 */
 	path = (Path *) create_append_path(result_rel, pathlist, NIL,
 									   NULL, 0, false, NIL, -1);
-	/* We have to manually jam the right tlist into the path; ick */
-	path->pathtarget = create_pathtarget(root, tlist);
 
 	/*
 	 * For UNION ALL, we just need the Append path.  For UNION, need to add
@@ -598,30 +617,32 @@ generate_union_path(SetOperationStmt *op, PlannerInfo *root,
 	if (!op->all)
 		path = make_union_unique(op, path, tlist, root);
 
+	add_path(result_rel, path);
+
 	/*
-	 * Estimate number of groups if caller wants it.  For now we just assume
-	 * the output is unique --- this is certainly true for the UNION case, and
-	 * we want worst-case estimates anyway.
+	 * Estimate number of groups.  For now we just assume the output is unique
+	 * --- this is certainly true for the UNION case, and we want worst-case
+	 * estimates anyway.
 	 */
-	if (pNumGroups)
-		*pNumGroups = path->rows;
+	result_rel->rows = path->rows;
 
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
-	return path;
+	return result_rel;
 }
 
 /*
- * Generate path for an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL node
+ * Generate paths for an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL node
  */
-static Path *
-generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
-					   List *refnames_tlist,
-					   List **pTargetList,
-					   double *pNumGroups)
+static RelOptInfo *
+generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
+						List *refnames_tlist,
+						List **pTargetList)
 {
-	RelOptInfo *result_rel = fetch_upper_rel(root, UPPERREL_SETOP, NULL);
+	RelOptInfo *result_rel;
+	RelOptInfo *lrel,
+			   *rrel;
 	double		save_fraction = root->tuple_fraction;
 	Path	   *lpath,
 			   *rpath,
@@ -646,18 +667,20 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 	root->tuple_fraction = 0.0;
 
 	/* Recurse on children, ensuring their outputs are marked */
-	lpath = recurse_set_operations(op->larg, root,
-								   op->colTypes, op->colCollations,
-								   false, 0,
-								   refnames_tlist,
-								   &lpath_tlist,
-								   &dLeftGroups);
-	rpath = recurse_set_operations(op->rarg, root,
-								   op->colTypes, op->colCollations,
-								   false, 1,
-								   refnames_tlist,
-								   &rpath_tlist,
-								   &dRightGroups);
+	lrel = recurse_set_operations(op->larg, root,
+								  op->colTypes, op->colCollations,
+								  false, 0,
+								  refnames_tlist,
+								  &lpath_tlist,
+								  &dLeftGroups);
+	lpath = lrel->cheapest_total_path;
+	rrel = recurse_set_operations(op->rarg, root,
+								  op->colTypes, op->colCollations,
+								  false, 1,
+								  refnames_tlist,
+								  &rpath_tlist,
+								  &dRightGroups);
+	rpath = rrel->cheapest_total_path;
 
 	/* Undo effects of forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
@@ -695,15 +718,17 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
+	/* Build result relation. */
+	result_rel = fetch_upper_rel(root, UPPERREL_SETOP,
+								 bms_union(lrel->relids, rrel->relids));
+	result_rel->reltarget = create_pathtarget(root, tlist);;
+
 	/*
 	 * Append the child results together.
 	 */
 	path = (Path *) create_append_path(result_rel, pathlist, NIL,
 									   NULL, 0, false, NIL, -1);
 
-	/* We have to manually jam the right tlist into the path; ick */
-	path->pathtarget = create_pathtarget(root, tlist);
-
 	/* Identify the grouping semantics */
 	groupList = generate_setop_grouplist(op, tlist);
 
@@ -769,10 +794,9 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
 									  dNumGroups,
 									  dNumOutputRows);
 
-	if (pNumGroups)
-		*pNumGroups = dNumGroups;
-
-	return path;
+	result_rel->rows = path->rows;
+	add_path(result_rel, path);
+	return result_rel;
 }
 
 /*
@@ -786,7 +810,7 @@ generate_nonunion_path(SetOperationStmt *op, PlannerInfo *root,
  * collations have the same notion of equality.  It is valid from an
  * implementation standpoint because we don't care about the ordering of
  * a UNION child's result: UNION ALL results are always unordered, and
- * generate_union_path will force a fresh sort if the top level is a UNION.
+ * generate_union_paths will force a fresh sort if the top level is a UNION.
  */
 static List *
 plan_union_children(PlannerInfo *root,
@@ -897,8 +921,6 @@ make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 															   groupList,
 															   tlist),
 								 -1.0);
-		/* We have to manually jam the right tlist into the path; ick */
-		path->pathtarget = create_pathtarget(root, tlist);
 		path = (Path *) create_upper_unique_path(root,
 												 result_rel,
 												 path,
@@ -910,6 +932,24 @@ make_union_unique(SetOperationStmt *op, Path *path, List *tlist,
 }
 
 /*
+ * postprocess_setop_rel - perform steps required after adding paths
+ */
+static void
+postprocess_setop_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	/*
+	 * We don't currently worry about allowing FDWs to contribute paths to
+	 * this relation, but give extensions a chance.
+	 */
+	if (create_upper_paths_hook)
+		(*create_upper_paths_hook) (root, UPPERREL_SETOP,
+									NULL, rel);
+
+	/* Select cheapest path */
+	set_cheapest(rel);
+}
+
+/*
  * choose_hashed_setop - should we use hashing for a set operation?
  */
 static bool
-- 
1.7.9.5

0004-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patch (text/x-patch; charset=US-ASCII)
From 7525f0baf1ea58544364c046d929d17a71abf84a Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Sat, 23 Dec 2017 10:56:07 -0800
Subject: [PATCH 4/4] Consider Parallel Append as a way to implement a union
 operation in a setop tree.

Patch by me, reviewed by Ashutosh Bapat.
---
 src/backend/optimizer/prep/prepunion.c |   93 +++++++++++++++++++++++++++++++-
 1 file changed, 91 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index f087369..0114799 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -299,11 +299,17 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		set_subquery_size_estimates(root, rel);
 
 		/*
+		 * Since we may want to add a partial path to this relation, we must
+		 * set its consider_parallel flag correctly.
+		 */
+		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
+		rel->consider_parallel = final_rel->consider_parallel;
+
+		/*
 		 * For the moment, we consider only a single Path for the subquery.
 		 * This should change soon (make it look more like
 		 * set_subquery_pathlist).
 		 */
-		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
 		subpath = get_cheapest_fractional_path(final_rel,
 											   root->tuple_fraction);
 
@@ -321,6 +327,23 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
 		add_path(rel, path);
 
 		/*
+		 * If we have a partial path for the child relation, we can use that
+		 * to build a partial path for this relation.  But there's no point in
+		 * considering any path but the cheapest.
+		 */
+		if (final_rel->partial_pathlist != NIL)
+		{
+			Path	   *partial_subpath;
+			Path	   *partial_path;
+
+			partial_subpath = linitial(final_rel->partial_pathlist);
+			partial_path = (Path *)
+				create_subqueryscan_path(root, rel, partial_subpath,
+										 NIL, NULL);
+			add_partial_path(rel, partial_path);
+		}
+
+		/*
 		 * Estimate number of groups if caller wants it.  If the subquery used
 		 * grouping or aggregation, its output is probably mostly unique
 		 * anyway; otherwise do statistical estimation.
@@ -552,6 +575,9 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	double		save_fraction = root->tuple_fraction;
 	ListCell   *lc;
 	List	   *pathlist = NIL;
+	List	   *partial_pathlist = NIL;
+	bool		partial_paths_valid = true;
+	bool		consider_parallel = true;
 	List	   *rellist;
 	List	   *tlist_list;
 	List	   *tlist;
@@ -591,18 +617,34 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 
 	*pTargetList = tlist;
 
-	/* Build path list and relid set. */
+	/* Build path lists and relid set. */
 	foreach(lc, rellist)
 	{
 		RelOptInfo *rel = lfirst(lc);
 
 		pathlist = lappend(pathlist, rel->cheapest_total_path);
+
+		if (consider_parallel)
+		{
+			if (!rel->consider_parallel)
+			{
+				consider_parallel = false;
+				partial_paths_valid = false;
+			}
+			else if (rel->partial_pathlist == NIL)
+				partial_paths_valid = false;
+			else
+				partial_pathlist = lappend(partial_pathlist,
+										   linitial(rel->partial_pathlist));
+		}
+
 		relids = bms_union(relids, rel->relids);
 	}
 
 	/* Build result relation. */
 	result_rel = fetch_upper_rel(root, UPPERREL_SETOP, relids);
 	result_rel->reltarget = create_pathtarget(root, tlist);
+	result_rel->consider_parallel = consider_parallel;
 
 	/*
 	 * Append the child results together.
@@ -626,6 +668,53 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 	 */
 	result_rel->rows = path->rows;
 
+	/*
+	 * Now consider doing the same thing using the partial paths plus Append
+	 * plus Gather.
+	 */
+	if (partial_paths_valid)
+	{
+		Path	   *ppath;
+		ListCell   *lc;
+		int			parallel_workers = 0;
+
+		/* Find the highest number of workers requested for any subpath. */
+		foreach(lc, partial_pathlist)
+		{
+			Path	   *path = lfirst(lc);
+
+			parallel_workers = Max(parallel_workers, path->parallel_workers);
+		}
+		Assert(parallel_workers > 0);
+
+		/*
+		 * If the use of parallel append is permitted, always request at least
+		 * log2(# of children) paths.  We assume it can be useful to have
+		 * extra workers in this case because they will be spread out across
+		 * the children.  The precise formula is just a guess; see
+		 * add_paths_to_append_rel.
+		 */
+		if (enable_parallel_append)
+		{
+			parallel_workers = Max(parallel_workers,
+								   fls(list_length(partial_pathlist)));
+			parallel_workers = Min(parallel_workers,
+								   max_parallel_workers_per_gather);
+		}
+		Assert(parallel_workers > 0);
+
+		ppath = (Path *)
+			create_append_path(result_rel, NIL, partial_pathlist,
+							   NULL, parallel_workers, enable_parallel_append,
+							   NIL /* XXX? Is this right? */ , -1);
+		ppath = (Path *)
+			create_gather_path(root, result_rel, ppath,
+							   result_rel->reltarget, NULL, NULL);
+		if (!op->all)
+			ppath = make_union_unique(op, ppath, tlist, root);
+		add_path(result_rel, ppath);
+	}
+
 	/* Undo effects of possibly forcing tuple_fraction to 0 */
 	root->tuple_fraction = save_fraction;
 
-- 
1.7.9.5

#14Robert Haas
robertmhaas@gmail.com
In reply to: Ashutosh Bapat (#13)
Re: parallel append vs. simple UNION ALL

On Tue, Mar 13, 2018 at 12:35 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

It looks like it was not changed in all the places. make failed. I
have fixed all the instances of these two functions in the attached
patchset (only 0003 changes). Please check.

Oops. Thanks.

I'm going to go ahead and commit 0001 here. Any more thoughts on the rest?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Robert Haas (#14)
Re: parallel append vs. simple UNION ALL

On Wed, Mar 14, 2018 at 2:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 13, 2018 at 12:35 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

It looks like it was not changed in all the places. make failed. I
have fixed all the instances of these two functions in the attached
patchset (only 0003 changes). Please check.

Oops. Thanks.

I'm going to go ahead and commit 0001 here. Any more thoughts on the rest?

Nope. I am good with the patchset.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

#16Rajkumar Raghuwanshi
rajkumar.raghuwanshi@enterprisedb.com
In reply to: Robert Haas (#11)
Re: parallel append vs. simple UNION ALL

On Fri, Mar 9, 2018 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Great. Committed 0001. Are you planning any further testing of this
patch series?

Sorry I missed the mail.
Yes, I have further tested the patches and found no more issues.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

#17Robert Haas
robertmhaas@gmail.com
In reply to: Rajkumar Raghuwanshi (#16)
Re: parallel append vs. simple UNION ALL

On Fri, Mar 16, 2018 at 7:35 AM, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:

On Fri, Mar 9, 2018 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Great. Committed 0001. Are you planning any further testing of this
patch series?

Sorry I missed the mail.
Yes, I have further tested the patches and found no more issues.

OK, thanks to both you and Ashutosh Bapat. Committed 0002 and 0003.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#17)
Re: parallel append vs. simple UNION ALL

On Mon, Mar 19, 2018 at 11:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 16, 2018 at 7:35 AM, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:

On Fri, Mar 9, 2018 at 1:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Great. Committed 0001. Are you planning any further testing of this
patch series?

Sorry I missed the mail.
Yes, I have further tested the patches and found no more issues.

OK, thanks to both you and Ashutosh Bapat. Committed 0002 and 0003.

And now committed 0004. This area could use significantly more work,
but I think it's better to have this much than not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company