pgsql: Support Parallel Append plan nodes.

Started by Robert Haas, about 8 years ago, 12 messages
#1 Robert Haas
rhaas@postgresql.org

Support Parallel Append plan nodes.

When we create an Append node, we can spread out the workers over the
subplans instead of piling on to each subplan one at a time, which
should typically be a bit more efficient, both because the startup
cost of any plan executed entirely by one worker is paid only once and
also because of reduced contention. We can also construct Append
plans using a mix of partial and non-partial subplans, which may allow
for parallelism in places that otherwise couldn't support it.
Unfortunately, this patch doesn't handle the important case of
parallelizing UNION ALL by running each branch in a separate worker;
the executor infrastructure is added here, but more planner work is
needed.

Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by
Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and
Rajkumar Raghuwanshi.

Discussion: /messages/by-id/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/ab72716778128fb63d54ac256adf7fe6820a1185

Modified Files
--------------
doc/src/sgml/config.sgml | 14 ++
doc/src/sgml/monitoring.sgml | 7 +-
src/backend/executor/execParallel.c | 19 ++
src/backend/executor/nodeAppend.c | 331 +++++++++++++++++++++-----
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/list.c | 38 +++
src/backend/nodes/outfuncs.c | 1 +
src/backend/nodes/readfuncs.c | 1 +
src/backend/optimizer/path/allpaths.c | 216 ++++++++++++++---
src/backend/optimizer/path/costsize.c | 164 +++++++++++++
src/backend/optimizer/path/joinrels.c | 3 +-
src/backend/optimizer/plan/createplan.c | 10 +-
src/backend/optimizer/plan/planner.c | 5 +-
src/backend/optimizer/prep/prepunion.c | 7 +-
src/backend/optimizer/util/pathnode.c | 88 +++++--
src/backend/storage/lmgr/lwlock.c | 1 +
src/backend/utils/misc/guc.c | 9 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/executor/nodeAppend.h | 5 +
src/include/nodes/execnodes.h | 14 +-
src/include/nodes/pg_list.h | 3 +
src/include/nodes/plannodes.h | 1 +
src/include/nodes/relation.h | 6 +
src/include/optimizer/cost.h | 2 +
src/include/optimizer/pathnode.h | 9 +-
src/include/storage/lwlock.h | 1 +
src/test/regress/expected/inherit.out | 2 +
src/test/regress/expected/select_parallel.out | 91 ++++++-
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/inherit.sql | 2 +
src/test/regress/sql/select_parallel.sql | 33 ++-
31 files changed, 959 insertions(+), 129 deletions(-)
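
For illustration, a minimal SQL sketch of the mixed partial/non-partial case
the commit message describes, modeled on the select_parallel regression tests
touched by this commit (the a_star inheritance tree and per-table
parallel_workers settings are taken from those tests); the plan shape in the
comments is indicative, not verbatim EXPLAIN output:

-- Children with parallel_workers = 0 become non-partial subplans; the rest
-- can be scanned with partial (Parallel Seq Scan) subplans, all under one
-- Parallel Append.
set max_parallel_workers_per_gather = 4;
alter table c_star set (parallel_workers = 0);
alter table d_star set (parallel_workers = 0);
explain (costs off)
  select round(avg(aa)), sum(aa) from a_star;
--  Finalize Aggregate
--    ->  Gather
--          ->  Partial Aggregate
--                ->  Parallel Append
--                      ->  Seq Scan on d_star            (non-partial)
--                      ->  Seq Scan on c_star            (non-partial)
--                      ->  Parallel Seq Scan on a_star   (partial)
--                      ->  Parallel Seq Scan on b_star
--                      ->  Parallel Seq Scan on e_star
--                      ->  Parallel Seq Scan on f_star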

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#1)
Re: pgsql: Support Parallel Append plan nodes.

Robert Haas <rhaas@postgresql.org> writes:

Support Parallel Append plan nodes.

Buildfarm member sidewinder doesn't like this:

Program terminated with signal SIGABRT, Aborted.
#0 0x00007f7ff5b0e01a in _lwp_kill () from /usr/lib/libc.so.12
#0 0x00007f7ff5b0e01a in _lwp_kill () from /usr/lib/libc.so.12
#1 0x00007f7ff5b0dca5 in abort () from /usr/lib/libc.so.12
#2 0x000000000082b5e3 in ExceptionalCondition (conditionName=conditionName@entry=0x9ad548 "!(append->first_partial_plan < node->as_nplans)", errorType=errorType@entry=0x87489a "FailedAssertion", fileName=fileName@entry=0x9ad4d6 "nodeAppend.c", lineNumber=lineNumber@entry=509) at assert.c:54
#3 0x000000000060feb6 in choose_next_subplan_for_worker (node=0x7f7ff7357980) at nodeAppend.c:509
#4 0x000000000061007d in ExecAppend (pstate=0x7f7ff7357980) at nodeAppend.c:222
#5 0x0000000000610464 in ExecProcNode (node=0x7f7ff7357980) at ../../../src/include/executor/executor.h:241
#6 fetch_input_tuple (aggstate=aggstate@entry=0x7f7ff7357a98) at nodeAgg.c:699
#7 0x0000000000612531 in agg_retrieve_direct (aggstate=0x7f7ff7357a98) at nodeAgg.c:2456
#8 ExecAgg (pstate=0x7f7ff7357a98) at nodeAgg.c:2166
#9 0x0000000000603cbd in ExecProcNode (node=0x7f7ff7357a98) at ../../../src/include/executor/executor.h:241
#10 ExecutePlan (execute_once=<optimized out>, dest=0x7f7ff73556b8, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x7f7ff7357a98, estate=0x7f7ff7357040) at execMain.c:1718
#11 standard_ExecutorRun (queryDesc=0x7f7ff7369f58, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:361
#12 0x000000000060833b in ParallelQueryMain (seg=0x7f7ff7b65908, toc=0x7f7ff7fab000) at execParallel.c:1319
#13 0x00000000004f6e54 in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1150
#14 0x00000000006ba07f in StartBackgroundWorker () at bgworker.c:841
#15 0x00000000006c51f9 in do_start_bgworker (rw=<optimized out>) at postmaster.c:5741

Append.first_partial_plan is sadly underdocumented, but considering that
the loop above this Assert is prepared for the possibility that
first_partial_plan >= node->as_nplans, I'm not sure why this code supposes
that that can't happen.

BTW, I find it confusing and bad style that some of the references
in this function to node->as_pstate go through the local variable
pstate but others do not.

regards, tom lane

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
1 attachment(s)
Re: pgsql: Support Parallel Append plan nodes.

I wrote:

Robert Haas <rhaas@postgresql.org> writes:

Support Parallel Append plan nodes.

Buildfarm member sidewinder doesn't like this:

It transpires that my machine prairiedog also shows the failure.
I instrumented nodeAppend.c (as per attached patch) to see what
was going on, and I get this trace on prairiedog:

2017-12-05 22:50:30.228 EST [28812] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 0, nplans 6
2017-12-05 22:50:30.228 EST [28812] STATEMENT: select round(avg(aa)), sum(aa) from a_star a1;
2017-12-05 22:50:30.241 EST [28813] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 0, nplans 6
2017-12-05 22:50:30.241 EST [28813] STATEMENT: select round(avg(aa)), sum(aa) from a_star a1;
2017-12-05 22:50:30.252 EST [28814] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 0, nplans 6
2017-12-05 22:50:30.252 EST [28814] STATEMENT: select round(avg(aa)), sum(aa) from a_star a1;
2017-12-05 22:50:30.363 EST [28816] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan 0, fpp 2, nplans 6
2017-12-05 22:50:30.363 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.363 EST [28816] LOG: choose_next_subplan_for_worker: picking nextplan 0, fpp 2, nplans 6
2017-12-05 22:50:30.363 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.363 EST [28816] LOG: choose_next_subplan_for_worker: whichplan 0, nextplan 1, fpp 2, nplans 6
2017-12-05 22:50:30.363 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.364 EST [28816] LOG: choose_next_subplan_for_worker: picking nextplan 1, fpp 2, nplans 6
2017-12-05 22:50:30.364 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.364 EST [28816] LOG: choose_next_subplan_for_worker: whichplan 1, nextplan 2, fpp 2, nplans 6
2017-12-05 22:50:30.364 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.364 EST [28816] LOG: choose_next_subplan_for_worker: picking nextplan 2, fpp 2, nplans 6
2017-12-05 22:50:30.364 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.364 EST [28816] LOG: choose_next_subplan_for_worker: whichplan 2, nextplan 3, fpp 2, nplans 6
2017-12-05 22:50:30.364 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.364 EST [28816] LOG: choose_next_subplan_for_worker: picking nextplan 3, fpp 2, nplans 6
2017-12-05 22:50:30.364 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.365 EST [28816] LOG: choose_next_subplan_for_worker: whichplan 3, nextplan 4, fpp 2, nplans 6
2017-12-05 22:50:30.365 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.365 EST [28816] LOG: choose_next_subplan_for_worker: picking nextplan 4, fpp 2, nplans 6
2017-12-05 22:50:30.365 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.365 EST [28816] LOG: choose_next_subplan_for_worker: whichplan 4, nextplan 5, fpp 2, nplans 6
2017-12-05 22:50:30.365 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.365 EST [28816] LOG: choose_next_subplan_for_worker: picking nextplan 5, fpp 2, nplans 6
2017-12-05 22:50:30.365 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.365 EST [28816] LOG: choose_next_subplan_for_worker: whichplan 5, nextplan 2, fpp 2, nplans 6
2017-12-05 22:50:30.365 EST [28816] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.407 EST [28815] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 2, nplans 6
2017-12-05 22:50:30.407 EST [28815] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.474 EST [28817] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 2, nplans 6
2017-12-05 22:50:30.474 EST [28817] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:50:30.568 EST [28818] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan 0, fpp 6, nplans 6
2017-12-05 22:50:30.568 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.568 EST [28818] LOG: choose_next_subplan_for_worker: picking nextplan 0, fpp 6, nplans 6
2017-12-05 22:50:30.568 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.569 EST [28818] LOG: choose_next_subplan_for_worker: whichplan 0, nextplan 1, fpp 6, nplans 6
2017-12-05 22:50:30.569 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.569 EST [28818] LOG: choose_next_subplan_for_worker: picking nextplan 1, fpp 6, nplans 6
2017-12-05 22:50:30.569 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.569 EST [28818] LOG: choose_next_subplan_for_worker: whichplan 1, nextplan 2, fpp 6, nplans 6
2017-12-05 22:50:30.569 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.569 EST [28818] LOG: choose_next_subplan_for_worker: picking nextplan 2, fpp 6, nplans 6
2017-12-05 22:50:30.569 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.570 EST [28818] LOG: choose_next_subplan_for_worker: whichplan 2, nextplan 3, fpp 6, nplans 6
2017-12-05 22:50:30.570 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.570 EST [28818] LOG: choose_next_subplan_for_worker: picking nextplan 3, fpp 6, nplans 6
2017-12-05 22:50:30.570 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.570 EST [28818] LOG: choose_next_subplan_for_worker: whichplan 3, nextplan 4, fpp 6, nplans 6
2017-12-05 22:50:30.570 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.570 EST [28818] LOG: choose_next_subplan_for_worker: picking nextplan 4, fpp 6, nplans 6
2017-12-05 22:50:30.570 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.570 EST [28818] LOG: choose_next_subplan_for_worker: whichplan 4, nextplan 5, fpp 6, nplans 6
2017-12-05 22:50:30.570 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:50:30.570 EST [28818] LOG: choose_next_subplan_for_worker: picking nextplan 5, fpp 6, nplans 6
2017-12-05 22:50:30.570 EST [28818] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
TRAP: FailedAssertion("!(append->first_partial_plan < node->as_nplans)", File: "nodeAppend.c", Line: 520)
2017-12-05 22:50:38.211 EST [28158] LOG: background worker "parallel worker" (PID 28818) was terminated by signal 6: Abort trap
2017-12-05 22:50:38.211 EST [28158] DETAIL: Failed process was running: select round(avg(aa)), sum(aa) from a_star a3;

That makes it pretty clear what's going wrong, but why don't we see it
elsewhere?

On my development machine, the same patch yields

2017-12-05 22:42:24.187 EST [843] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 0, nplans 6
2017-12-05 22:42:24.187 EST [843] STATEMENT: select round(avg(aa)), sum(aa) from a_star a1;
2017-12-05 22:42:24.188 EST [844] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 0, nplans 6
2017-12-05 22:42:24.188 EST [844] STATEMENT: select round(avg(aa)), sum(aa) from a_star a1;
2017-12-05 22:42:24.188 EST [845] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 0, nplans 6
2017-12-05 22:42:24.188 EST [845] STATEMENT: select round(avg(aa)), sum(aa) from a_star a1;
2017-12-05 22:42:24.197 EST [846] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 2, nplans 6
2017-12-05 22:42:24.197 EST [846] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:42:24.200 EST [847] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 2, nplans 6
2017-12-05 22:42:24.200 EST [847] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:42:24.200 EST [848] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 2, nplans 6
2017-12-05 22:42:24.200 EST [848] STATEMENT: select round(avg(aa)), sum(aa) from a_star a2;
2017-12-05 22:42:24.214 EST [849] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 6, nplans 6
2017-12-05 22:42:24.214 EST [849] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:42:24.214 EST [852] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 6, nplans 6
2017-12-05 22:42:24.214 EST [852] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;
2017-12-05 22:42:24.214 EST [853] LOG: choose_next_subplan_for_worker: whichplan -1, nextplan -1, fpp 6, nplans 6
2017-12-05 22:42:24.214 EST [853] STATEMENT: select round(avg(aa)), sum(aa) from a_star a3;

I conclude that the reason we don't see the crash except on the slowest
buildfarm critters is that on most machines the leader process manages
to finish off all the subplans before any of the workers get as far as
doing something. This doesn't give me a warm fuzzy feeling about how
much testing this patch has gotten.

regards, tom lane

Attachments:

instrument-choose_next_subplan_for_worker.patch (text/x-diff)
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 246a0b2..4c4355b 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -463,6 +463,12 @@ choose_next_subplan_for_worker(AppendState *node)
 
 	LWLockAcquire(&pstate->pa_lock, LW_EXCLUSIVE);
 
+	elog(LOG, "choose_next_subplan_for_worker: whichplan %d, nextplan %d, fpp %d, nplans %d",
+		 node->as_whichplan,
+		 pstate->pa_next_plan,
+		 append->first_partial_plan,
+		 node->as_nplans);
+
 	/* Mark just-completed subplan as finished. */
 	if (node->as_whichplan != INVALID_SUBPLAN_INDEX)
 		node->as_pstate->pa_finished[node->as_whichplan] = true;
@@ -502,6 +508,11 @@ choose_next_subplan_for_worker(AppendState *node)
 		}
 	}
 
+	elog(LOG, "choose_next_subplan_for_worker: picking nextplan %d, fpp %d, nplans %d",
+		 pstate->pa_next_plan,
+		 append->first_partial_plan,
+		 node->as_nplans);
+
 	/* Pick the plan we found, and advance pa_next_plan one more time. */
 	node->as_whichplan = pstate->pa_next_plan++;
 	if (pstate->pa_next_plan >= node->as_nplans)
#4 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: pgsql: Support Parallel Append plan nodes.

Hi,

On 2017-12-05 23:01:59 -0500, Tom Lane wrote:

I conclude that the reason we don't see the crash except on the slowest
buildfarm critters is that on most machines the leader process manages
to finish off all the subplans before any of the workers get as far as
doing something. This doesn't give me a warm fuzzy feeling about how
much testing this patch has gotten.

If that's the case, it should be reproducible on fast machines if one
sets parallel_leader_participation = off during an installcheck. Which
indeed crashes here, albeit on a heavily modified tree:

#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007f4c05afb19a in __GI_abort () at abort.c:89
#2 0x000055d89651e83a in ExceptionalCondition (conditionName=0x55d8966edef8 "!(append->first_partial_plan < node->as_nplans)",
errorType=0x55d8966edd86 "FailedAssertion", fileName=0x55d8966eddd8 "/home/andres/src/postgresql/src/backend/executor/nodeAppend.c",
lineNumber=503) at /home/andres/src/postgresql/src/backend/utils/error/assert.c:54
#3 0x000055d8961eab63 in choose_next_subplan_for_worker (node=0x55d8974535a0)
at /home/andres/src/postgresql/src/backend/executor/nodeAppend.c:503
#4 0x000055d8961ea47b in ExecAppend (pstate=0x55d8974535a0) at /home/andres/src/postgresql/src/backend/executor/nodeAppend.c:216
#5 0x000055d8961d6c55 in ExecProcNode (node=0x55d8974535a0) at /home/andres/src/postgresql/src/include/executor/executor.h:239

Greetings,

Andres Freund
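
For reference, a minimal SQL sketch of the reproduction Andres describes,
assuming the a_star tables and settings from the select_parallel regression
tests discussed above; forcing every child to be non-partial makes
first_partial_plan equal as_nplans, which is exactly the case the failing
Assert rules out:

set parallel_leader_participation = off;
set max_parallel_workers_per_gather = 4;
alter table a_star set (parallel_workers = 0);
alter table b_star set (parallel_workers = 0);
alter table c_star set (parallel_workers = 0);
alter table d_star set (parallel_workers = 0);
alter table e_star set (parallel_workers = 0);
alter table f_star set (parallel_workers = 0);
-- Gather over a Parallel Append of only non-partial Seq Scans; before the
-- fix worked out later in this thread, a worker that arrives after the last
-- plan has been claimed can trip the Assert in
-- choose_next_subplan_for_worker().
select round(avg(aa)), sum(aa) from a_star;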

#5 amul sul
sulamul@gmail.com
In reply to: Tom Lane (#3)
1 attachment(s)
Re: pgsql: Support Parallel Append plan nodes.

Thanks Tom for the crash analysis. I think this is the same reason that
Rajkumar's reported case[1] was crashing only with partition-wise join
enabled. I tried to fix this crash earlier[2], but got it wrong at the
place where I added the assert check, on the assumption that we always
come there from the previous check in the while loop.

Instead of that assert check, we need similar bailout logic[2] before
looping back to the first partial plan. The attached patch does that;
I've tested with parallel_leader_participation = off as suggested by
Andres, and make check looks good apart from some obvious regression
diffs.

1] /messages/by-id/CAKcux6m+6nTO6C08kKaj-Waffvgvp-9SD3RCGStX=Mzk0gQU8g@mail.gmail.com
2] /messages/by-id/CAAJ_b975k58H+Ed4=p0vbJunwO2reOMX5CVB8_R=JmXxY3uW=Q@mail.gmail.com

Regards,
Amul

Attachments:

Parallel-Append-crash-fix.patch (application/octet-stream)
From 9a785314183af0339a0be88dcc9245d24e8d9818 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Wed, 6 Dec 2017 10:22:20 +0530
Subject: [PATCH] Parallel Append crash fix

Fixes the crash on the buildfarm[1]

The reason behind the crash is the same one we had discussed before[2],
but this time the crash is due to an assert that I added on the
assumption that we always come from the previous check in the while loop.

1] https://postgr.es/m/17868.1512519318@sss.pgh.pa.us
2] https://postgr.es/m/CAAJ_b975k58H+Ed4=p0vbJunwO2reOMX5CVB8_R=JmXxY3uW=Q@mail.gmail.com
---
 src/backend/executor/nodeAppend.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 246a0b2d85..528a88b240 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -506,7 +506,14 @@ choose_next_subplan_for_worker(AppendState *node)
 	node->as_whichplan = pstate->pa_next_plan++;
 	if (pstate->pa_next_plan >= node->as_nplans)
 	{
-		Assert(append->first_partial_plan < node->as_nplans);
+		/* No partial plans then bail out. */
+		if (append->first_partial_plan >= node->as_nplans)
+		{
+			pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
+			node->as_whichplan = INVALID_SUBPLAN_INDEX;
+			LWLockRelease(&pstate->pa_lock);
+			return false;
+		}
 		pstate->pa_next_plan = append->first_partial_plan;
 	}
 
-- 
2.14.1

#6 amul sul
sulamul@gmail.com
In reply to: amul sul (#5)
1 attachment(s)
Re: pgsql: Support Parallel Append plan nodes.

Copying & replying to Amit Khandekar's email here:

On Wed, Dec 6, 2017 at 11:45 AM, amul sul <sulamul@gmail.com> wrote:

@@ -506,7 +506,14 @@ choose_next_subplan_for_worker(AppendState *node)
node->as_whichplan = pstate->pa_next_plan++;
if (pstate->pa_next_plan >= node->as_nplans)
{
- Assert(append->first_partial_plan < node->as_nplans);
+ /* No partial plans then bail out. */
+ if (append->first_partial_plan >= node->as_nplans)
+ {
+ pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
+ node->as_whichplan = INVALID_SUBPLAN_INDEX;
+ LWLockRelease(&pstate->pa_lock);
+ return false;
+ }
pstate->pa_next_plan = append->first_partial_plan;

In the above code, the fact that we have not bailed out from the
earlier for loop means that we have already found an unfinished plan
and node->as_whichplan is set to that plan. So while setting the next
plan above for the other workers to pick, we should not return false,
nor should we set node->as_whichplan to INVALID_SUBPLAN_INDEX.
Instead, just set pa_next_plan to INVALID_SUBPLAN_INDEX. Otherwise,
the chosen plan won't get executed at all.

Understood, thanks for the review. Updated patch attached.

1] /messages/by-id/CAJ3gD9e3_D3fFqzWBFYoaF-6yCXgbOFn3Mb-pgd_mxvjpsw7Rw@mail.gmail.com

Regards,
Amul

Attachments:

Parallel-Append-crash-fix-v2.patch (application/octet-stream)
From 800996d580e7eb3395efeb9c95d6c077866e84b8 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Wed, 6 Dec 2017 14:28:27 +0530
Subject: [PATCH] Parallel Append crash fix v2

v2: Update w.r.t Amit Khandekar's review comments[3]:
 - Setting pstate->pa_next_plan to invalid index, not to bail out

v1:
Fixes the crash on the buildfarm[1]

The reason behind the crash is the same one we had discussed before[2],
but this time the crash is due to an assert that I added on the
assumption that we always come from the previous check in the while loop.

--------------
References:
--------------
1] https://postgr.es/m/17868.1512519318@sss.pgh.pa.us
2] https://postgr.es/m/CAAJ_b975k58H+Ed4=p0vbJunwO2reOMX5CVB8_R=JmXxY3uW=Q@mail.gmail.com
3] https://postgr.es/m/CAJ3gD9e3_D3fFqzWBFYoaF-6yCXgbOFn3Mb-pgd_mxvjpsw7Rw@mail.gmail.com
---
 src/backend/executor/nodeAppend.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 246a0b2d85..96d9a7aed5 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -506,8 +506,10 @@ choose_next_subplan_for_worker(AppendState *node)
 	node->as_whichplan = pstate->pa_next_plan++;
 	if (pstate->pa_next_plan >= node->as_nplans)
 	{
-		Assert(append->first_partial_plan < node->as_nplans);
-		pstate->pa_next_plan = append->first_partial_plan;
+		if (append->first_partial_plan < node->as_nplans)
+			pstate->pa_next_plan = append->first_partial_plan;
+		else
+			pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
 	}
 
 	/* If non-partial, immediately mark as finished. */
-- 
2.14.1

#7 Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: amul sul (#6)
1 attachment(s)
Re: pgsql: Support Parallel Append plan nodes.

On 6 December 2017 at 14:31, amul sul <sulamul@gmail.com> wrote:

Understood, thanks for the review. Updated patch attached.

This looks good.

In attached revised patch, just added some comments in the changes that you did.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

Parallel-Append-crash-fix-v3.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 246a0b2..0e93713 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -506,8 +506,16 @@ choose_next_subplan_for_worker(AppendState *node)
 	node->as_whichplan = pstate->pa_next_plan++;
 	if (pstate->pa_next_plan >= node->as_nplans)
 	{
-		Assert(append->first_partial_plan < node->as_nplans);
-		pstate->pa_next_plan = append->first_partial_plan;
+		if (append->first_partial_plan < node->as_nplans)
+			pstate->pa_next_plan = append->first_partial_plan;
+		else
+		{
+			/*
+			 * We have only non-partial plans, and we already chose the last
+			 * one; so arrange for the other workers to immediately bail out.
+			 */
+			pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
+		}
 	}
 
 	/* If non-partial, immediately mark as finished. */
#8 Robert Haas
robertmhaas@gmail.com
In reply to: Amit Khandekar (#7)
Re: pgsql: Support Parallel Append plan nodes.

On Wed, Dec 6, 2017 at 5:01 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

This looks good.

In attached revised patch, just added some comments in the changes that you did.

Committed, thanks. It's rather embarrassing that I didn't notice this
problem, because I did compare that logic with the preceding loop. I
concluded it was OK on the theory the previous loop would have already
given up if there were no partial plans. But that's wrong, of course:
the previous loop will not have given up if it grabbed the last plan
in a list of only non-partial plans. Oops.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#8)
Re: pgsql: Support Parallel Append plan nodes.

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Dec 6, 2017 at 5:01 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

In attached revised patch, just added some comments in the changes that you did.

Committed, thanks.

While this patch (looks like it) fixes the logic error, it does nothing
for the problem that the committed test cases don't actually exercise
any of this code on most machines -- certainly not whichever one is
producing the code coverage report:
https://coverage.postgresql.org/src/backend/executor/nodeAppend.c.gcov.html

Can we do something with Andres' suggestion of running these test cases
under parallel_leader_participation = off?

regards, tom lane

#10 Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Tom Lane (#9)
Re: pgsql: Support Parallel Append plan nodes.

On 6 December 2017 at 20:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Can we do something with Andres' suggestion of running these test cases
under parallel_leader_participation = off?

Yes, I am planning to send a patch that makes all those
Parallel-Append test cases run once with parallel_leader_participation
"on" and then run again with the GUC "off". Thanks.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

#11 Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Amit Khandekar (#10)
1 attachment(s)
Re: pgsql: Support Parallel Append plan nodes.

On 7 December 2017 at 12:32, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

Yes, I am planning to send a patch that makes all those
Parallel-Append test cases run once with parallel_leader_participation
"on" and then run again with the GUC "off". Thanks.

Attached is a patch that adds the above test cases. This allows
coverage for the function choose_next_subplan_for_worker().

The code to advance pa_next_plan is there inside the while loop (place_1)
and again below the loop (place_2). At both these places, the logic
involves wrapping around to the first (partial) plan. The code coverage
exists for this wrap-around logic at place_2, but not at place_1 (i.e.
inside the loop):

470 : /* If all the plans are already done, we have nothing to do */
471 72 : if (pstate->pa_next_plan == INVALID_SUBPLAN_INDEX)
472 : {
473 32 : LWLockRelease(&pstate->pa_lock);
474 32 : return false;
475 : }
476 :
477 : /* Loop until we find a subplan to execute. */
478 92 : while (pstate->pa_finished[pstate->pa_next_plan])
479 : {
480 16 : if (pstate->pa_next_plan < node->as_nplans - 1)
481 : {
482 : /* Advance to next plan. */
483 16 : pstate->pa_next_plan++;
484 : }
485 0 : else if (append->first_partial_plan < node->as_nplans)
486 : {
487 : /* Loop back to first partial plan. */
488 0 : pstate->pa_next_plan = append->first_partial_plan;
489 : }
490 : else
491 : {
492 : /* At last plan, no partial plans, arrange to bail out. */
493 0 : pstate->pa_next_plan = node->as_whichplan;
494 : }
495 :
496 16 : if (pstate->pa_next_plan == node->as_whichplan)
497 : {
498 : /* We've tried everything! */
499 4 : pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
500 4 : LWLockRelease(&pstate->pa_lock);
501 4 : return false;
502 : }
503 : }

I have not managed to make the regression test cases execute the above
not-covered case. I could do that with my local test that I have, but
that test has lots of data, so it is slow, and not suitable for
regression. In select_parallel.sql, by the time a worker w1 gets into
the function choose_next_subplan_for_worker(), an earlier worker w2
must have already wrapped around the pa_next_plan at place_2, so this
worker doesn't get a chance to hit place_1. It's a timing issue. I
will see if I can solve this.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

test_with_noleader.patch (application/octet-stream)
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index ff00d47..ff3e083 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -35,6 +35,40 @@ select round(avg(aa)), sum(aa) from a_star a1;
     14 | 355
 (1 row)
 
+-- leader participation disabled.
+set parallel_leader_participation = off;
+-- Note : With no leader, subplans become non-partial, unless parallel_workers
+-- is set for each of them.
+alter table c_star set (parallel_workers = 3);
+alter table d_star set (parallel_workers = 3);
+alter table a_star set (parallel_workers = 3);
+alter table b_star set (parallel_workers = 3);
+alter table e_star set (parallel_workers = 3);
+alter table f_star set (parallel_workers = 3);
+explain (costs off)
+  select round(avg(aa)), sum(aa) from a_star;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Finalize Aggregate
+   ->  Gather
+         Workers Planned: 3
+         ->  Partial Aggregate
+               ->  Parallel Append
+                     ->  Parallel Seq Scan on a_star
+                     ->  Parallel Seq Scan on b_star
+                     ->  Parallel Seq Scan on c_star
+                     ->  Parallel Seq Scan on d_star
+                     ->  Parallel Seq Scan on e_star
+                     ->  Parallel Seq Scan on f_star
+(11 rows)
+
+select round(avg(aa)), sum(aa) from a_star a1_1;
+ round | sum 
+-------+-----
+    14 | 355
+(1 row)
+
+reset parallel_leader_participation;
 -- Parallel Append with both partial and non-partial subplans
 alter table c_star set (parallel_workers = 0);
 alter table d_star set (parallel_workers = 0);
@@ -61,6 +95,32 @@ select round(avg(aa)), sum(aa) from a_star a2;
     14 | 355
 (1 row)
 
+-- leader participation disabled.
+set parallel_leader_participation = off;
+explain (costs off)
+  select round(avg(aa)), sum(aa) from a_star;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Finalize Aggregate
+   ->  Gather
+         Workers Planned: 3
+         ->  Partial Aggregate
+               ->  Parallel Append
+                     ->  Seq Scan on d_star
+                     ->  Seq Scan on c_star
+                     ->  Parallel Seq Scan on a_star
+                     ->  Parallel Seq Scan on b_star
+                     ->  Parallel Seq Scan on e_star
+                     ->  Parallel Seq Scan on f_star
+(11 rows)
+
+select round(avg(aa)), sum(aa) from a_star a2_1;
+ round | sum 
+-------+-----
+    14 | 355
+(1 row)
+
+reset parallel_leader_participation;
 -- Parallel Append with only non-partial subplans
 alter table a_star set (parallel_workers = 0);
 alter table b_star set (parallel_workers = 0);
@@ -89,6 +149,32 @@ select round(avg(aa)), sum(aa) from a_star a3;
     14 | 355
 (1 row)
 
+-- leader participation disabled.
+set parallel_leader_participation = off;
+explain (costs off)
+  select round(avg(aa)), sum(aa) from a_star;
+                 QUERY PLAN                 
+--------------------------------------------
+ Finalize Aggregate
+   ->  Gather
+         Workers Planned: 3
+         ->  Partial Aggregate
+               ->  Parallel Append
+                     ->  Seq Scan on d_star
+                     ->  Seq Scan on f_star
+                     ->  Seq Scan on e_star
+                     ->  Seq Scan on b_star
+                     ->  Seq Scan on c_star
+                     ->  Seq Scan on a_star
+(11 rows)
+
+select round(avg(aa)), sum(aa) from a_star a3_1;
+ round | sum 
+-------+-----
+    14 | 355
+(1 row)
+
+reset parallel_leader_participation;
 -- Disable Parallel Append
 alter table a_star reset (parallel_workers);
 alter table b_star reset (parallel_workers);
@@ -120,6 +206,29 @@ select round(avg(aa)), sum(aa) from a_star a4;
     14 | 355
 (1 row)
 
+-- leader participation disabled.
+set parallel_leader_participation = off;
+explain (costs off)
+  select round(avg(aa)), sum(aa) from a_star;
+           QUERY PLAN           
+--------------------------------
+ Aggregate
+   ->  Append
+         ->  Seq Scan on a_star
+         ->  Seq Scan on b_star
+         ->  Seq Scan on c_star
+         ->  Seq Scan on d_star
+         ->  Seq Scan on e_star
+         ->  Seq Scan on f_star
+(8 rows)
+
+select round(avg(aa)), sum(aa) from a_star a4_1;
+ round | sum 
+-------+-----
+    14 | 355
+(1 row)
+
+reset parallel_leader_participation;
 reset enable_parallel_append;
 -- test with leader participation disabled
 set parallel_leader_participation = off;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 1035d04..a79422f 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -19,6 +19,20 @@ set max_parallel_workers_per_gather=4;
 explain (costs off)
   select round(avg(aa)), sum(aa) from a_star;
 select round(avg(aa)), sum(aa) from a_star a1;
+-- leader participation disabled.
+set parallel_leader_participation = off;
+-- Note : With no leader, subplans become non-partial, unless parallel_workers
+-- is set for each of them.
+alter table c_star set (parallel_workers = 3);
+alter table d_star set (parallel_workers = 3);
+alter table a_star set (parallel_workers = 3);
+alter table b_star set (parallel_workers = 3);
+alter table e_star set (parallel_workers = 3);
+alter table f_star set (parallel_workers = 3);
+explain (costs off)
+  select round(avg(aa)), sum(aa) from a_star;
+select round(avg(aa)), sum(aa) from a_star a1_1;
+reset parallel_leader_participation;
 
 -- Parallel Append with both partial and non-partial subplans
 alter table c_star set (parallel_workers = 0);
@@ -26,6 +40,13 @@ alter table d_star set (parallel_workers = 0);
 explain (costs off)
   select round(avg(aa)), sum(aa) from a_star;
 select round(avg(aa)), sum(aa) from a_star a2;
+-- leader participation disabled.
+set parallel_leader_participation = off;
+explain (costs off)
+  select round(avg(aa)), sum(aa) from a_star;
+select round(avg(aa)), sum(aa) from a_star a2_1;
+reset parallel_leader_participation;
+
 
 -- Parallel Append with only non-partial subplans
 alter table a_star set (parallel_workers = 0);
@@ -35,6 +56,12 @@ alter table f_star set (parallel_workers = 0);
 explain (costs off)
   select round(avg(aa)), sum(aa) from a_star;
 select round(avg(aa)), sum(aa) from a_star a3;
+-- leader participation disabled.
+set parallel_leader_participation = off;
+explain (costs off)
+  select round(avg(aa)), sum(aa) from a_star;
+select round(avg(aa)), sum(aa) from a_star a3_1;
+reset parallel_leader_participation;
 
 -- Disable Parallel Append
 alter table a_star reset (parallel_workers);
@@ -47,6 +74,12 @@ set enable_parallel_append to off;
 explain (costs off)
   select round(avg(aa)), sum(aa) from a_star;
 select round(avg(aa)), sum(aa) from a_star a4;
+-- leader participation disabled.
+set parallel_leader_participation = off;
+explain (costs off)
+  select round(avg(aa)), sum(aa) from a_star;
+select round(avg(aa)), sum(aa) from a_star a4_1;
+reset parallel_leader_participation;
 reset enable_parallel_append;
 
 -- test with leader participation disabled
#12 Robert Haas
robertmhaas@gmail.com
In reply to: Amit Khandekar (#11)
Re: pgsql: Support Parallel Append plan nodes.

On Thu, Dec 7, 2017 at 6:01 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

I have not managed to make the regression test cases execute the above
not-covered case. I could do that with my local test that I have, but
that test has lots of data, so it is slow, and not suitable for
regression. In select_parallel.sql, by the time a worker w1 gets into
the function choose_next_subplan_for_worker(), an earlier worker w2
must have already wrapped around the pa_next_plan at place_2, so this
worker doesn't get a chance to hit place_1. It's a timing issue. I
will see if I can solve this.

Well, it seems to me that just running the test case with
parallel_leader_participation = off might be good enough. It would
hit a decent chunk of this logic, and that's better than what we have
today. If we're determined to hit the wraparound case, the way to do
it is probably to make the first child of the append something that
takes a long time to execute and the second one something quick. But,
as you say, it's hard to do such things in the regression tests
without making them slow, and I have observed Tom to dislike things
that make the regression tests slow.
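
For what it's worth, a sketch of that "slow first child, quick second child"
shape; the table names and sizes here are invented rather than taken from the
thread, and this is not claimed to hit the wrap-around branch
deterministically; it only biases the timing in that direction:

-- Hypothetical setup: one large non-partial child and one tiny partial
-- child.  With the leader not participating, the worker that claims the
-- big non-partial scan stays busy while another worker exhausts the
-- remaining plans and has to wrap pa_next_plan back to the first partial
-- plan.
create table pa_parent (aa int);
create table pa_big () inherits (pa_parent);
create table pa_small () inherits (pa_parent);
insert into pa_big select g from generate_series(1, 1000000) g;
insert into pa_small values (1);
alter table pa_big set (parallel_workers = 0);    -- non-partial subplan
alter table pa_small set (parallel_workers = 2);  -- partial subplan
set parallel_leader_participation = off;
set max_parallel_workers_per_gather = 2;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
select count(*) from pa_parent;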

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company