[sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

Started by Andreas Seltenreich over 9 years ago · 24 messages
#1 Andreas Seltenreich
seltenreich@gmx.de
1 attachment(s)

Hi,

when fuzz testing master as of c1543a8, parallel workers trigger the
following assertion in ExecInitSubPlan every couple of hours.

TRAP: FailedAssertion("!(list != ((List *) ((void *)0)))", File: "list.c", Line: 390)

Sample backtraces of a worker and leader below, plan of leader attached.
The collected queries don't seem to reproduce it. Curiously, running
explain on them on the failed instance after crash recovery never shows
any gather nodes…

regards,
andreas

Core was generated by `postgres: bgworker: parallel worker for PID 28062 '.
Program terminated with signal SIGABRT, Aborted.
(gdb) bt
#3 0x000000000061bad2 in list_nth_cell (list=0x0, n=<optimized out>) at list.c:390
#4 0x000000000061bb26 in list_nth (list=<optimized out>, n=<optimized out>) at list.c:413
#5 0x00000000005fe566 in ExecInitSubPlan (subplan=subplan@entry=0x1522a08, parent=parent@entry=0x1538188) at nodeSubplan.c:705
#6 0x00000000005e3b54 in ExecInitExpr (node=0x1522a08, parent=parent@entry=0x1538188) at execQual.c:4724
#7 0x00000000005e415c in ExecInitExpr (node=<optimized out>, parent=parent@entry=0x1538188) at execQual.c:5164
#8 0x00000000005ff3fc in ExecInitAlternativeSubPlan (asplan=asplan@entry=0x1522060, parent=parent@entry=0x1538188) at nodeSubplan.c:1185
#9 0x00000000005e35c4 in ExecInitExpr (node=0x1522060, parent=parent@entry=0x1538188) at execQual.c:4740
#10 0x00000000005e3f8a in ExecInitExpr (node=0x1522978, parent=parent@entry=0x1538188) at execQual.c:4845
#11 0x00000000005e415c in ExecInitExpr (node=<optimized out>, parent=parent@entry=0x1538188) at execQual.c:5164
#12 0x00000000005e3687 in ExecInitExpr (node=0x1522920, parent=parent@entry=0x1538188) at execQual.c:4648
#13 0x00000000005e330f in ExecInitExpr (node=0x15228c8, parent=parent@entry=0x1538188) at execQual.c:5132
#14 0x00000000005e326f in ExecInitExpr (node=0x1522870, parent=parent@entry=0x1538188) at execQual.c:5152
#15 0x00000000005e415c in ExecInitExpr (node=<optimized out>, parent=parent@entry=0x1538188) at execQual.c:5164
#16 0x00000000005fbb62 in ExecInitSeqScan (node=0x1522728, estate=0x15379b8, eflags=16) at nodeSeqscan.c:192
#17 0x00000000005dd567 in ExecInitNode (node=0x1522728, estate=estate@entry=0x15379b8, eflags=eflags@entry=16) at execProcnode.c:192
#18 0x00000000005f12a5 in ExecInitHashJoin (node=0x1522530, estate=0x15379b8, eflags=16) at nodeHashjoin.c:489
#19 0x00000000005dd497 in ExecInitNode (node=node@entry=0x1522530, estate=estate@entry=0x15379b8, eflags=eflags@entry=16) at execProcnode.c:275
#20 0x00000000005dae6c in InitPlan (eflags=16, queryDesc=<optimized out>) at execMain.c:959
#21 standard_ExecutorStart (queryDesc=<optimized out>, eflags=16) at execMain.c:238
#22 0x00000000005dcac4 in ParallelQueryMain (seg=<optimized out>, toc=0x7f442d27b000) at execParallel.c:729
#23 0x00000000004e631b in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1033
#24 0x0000000000683af2 in StartBackgroundWorker () at bgworker.c:726
#25 0x000000000068ec32 in do_start_bgworker (rw=0x14c4a20) at postmaster.c:5531
#26 maybe_start_bgworker () at postmaster.c:5706
#27 0x000000000068f686 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:4967
#28 <signal handler called>
#29 0x00007f442c839ac3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#30 0x000000000046c144 in ServerLoop () at postmaster.c:1654
#31 0x0000000000690aae in PostmasterMain (argc=argc@entry=4, argv=argv@entry=0x14a1560) at postmaster.c:1298
#32 0x000000000046d78d in main (argc=4, argv=0x14a1560) at main.c:228
(gdb) frame 5
#5 0x00000000005fe566 in ExecInitSubPlan (subplan=subplan@entry=0x1522a08, parent=parent@entry=0x1538188) at nodeSubplan.c:705
(gdb) list
704 /* Link the SubPlanState to already-initialized subplan */
705 sstate->planstate = (PlanState *) list_nth(estate->es_subplanstates,
706 subplan->plan_id - 1);

(gdb) attach 28062
Attaching to program: /home/smith/postgres/inst/master/bin/postgres, process 28062
(gdb) bt
#0 0x00007f442c840e33 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x00000000006d1dde in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7fffdedd75a0, cur_timeout=-1, set=0xe3eedb8) at latch.c:981
#2 WaitEventSetWait (set=set@entry=0xe3eedb8, timeout=timeout@entry=-1, occurred_events=occurred_events@entry=0x7fffdedd75a0, nevents=nevents@entry=1) at latch.c:935
#3 0x00000000006d2226 in WaitLatchOrSocket (latch=0x7f442c0d9644, wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=-1, timeout@entry=0) at latch.c:347
#4 0x00000000006d22ed in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0) at latch.c:302
#5 0x00000000005ef4e3 in gather_readnext (gatherstate=0xe3d7ce8) at nodeGather.c:384
#6 gather_getnext (gatherstate=0xe3d7ce8) at nodeGather.c:283
#7 ExecGather (node=node@entry=0xe3d7ce8) at nodeGather.c:229
#8 0x00000000005dd728 in ExecProcNode (node=node@entry=0xe3d7ce8) at execProcnode.c:515
#9 0x00000000005f995c in ExecNestLoop (node=node@entry=0x10ef9d90) at nodeNestloop.c:174
#10 0x00000000005dd7b8 in ExecProcNode (node=node@entry=0x10ef9d90) at execProcnode.c:476
#11 0x00000000005f4990 in ExecLimit (node=node@entry=0x10ef9ac0) at nodeLimit.c:91
#12 0x00000000005dd6b8 in ExecProcNode (node=node@entry=0x10ef9ac0) at execProcnode.c:531
#13 0x00000000005d993f in ExecutePlan (dest=0x5babba0, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x10ef9ac0, estate=0x368b6e8) at execMain.c:1567
#14 standard_ExecutorRun (queryDesc=0x52a89f8, direction=<optimized out>, count=0) at execMain.c:338
#15 0x00000000006f7118 in PortalRunSelect (portal=portal@entry=0x3e84a10, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x5babba0) at pquery.c:946
#16 0x00000000006f863e in PortalRun (portal=0x3e84a10, count=9223372036854775807, isTopLevel=<optimized out>, dest=0x5babba0, altdest=0x5babba0, completionTag=0x7fffdedd7b30 "") at pquery.c:787
#17 0x00000000006f5ef3 in exec_simple_query (query_string=<optimized out>) at postgres.c:1094
#18 PostgresMain (argc=65554960, argv=0xa38beb0, dbname=0x14a2578 "regression", username=0xa38c088 "\260\276\070\n") at postgres.c:4059
#19 0x000000000046c82a in BackendRun (port=0x14c75b0) at postmaster.c:4258
#20 BackendStartup (port=0x14c75b0) at postmaster.c:3932
#21 ServerLoop () at postmaster.c:1690
#22 0x0000000000690aae in PostmasterMain (argc=argc@entry=4, argv=argv@entry=0x14a1560) at postmaster.c:1298
#23 0x000000000046d78d in main (argc=4, argv=0x14a1560) at main.c:228

(gdb) p debug_query_string
select ref_68.is_trigger_deletable as c0, (select d from
public.renamecolumnchild limit 1 offset 15) as c1,
ref_68.is_trigger_insertable_into as c2, 3 as c3 from
information_schema.role_udt_grants as ref_67 left join
information_schema.views as ref_68 on (ref_67.grantee =
ref_68.table_catalog ) where true limit 138;

Attachments:

plan.txt.bz2 (application/octet-stream)
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andreas Seltenreich (#1)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

Andreas Seltenreich <seltenreich@gmx.de> writes:

when fuzz testing master as of c1543a8, parallel workers trigger the
following assertion in ExecInitSubPlan every couple of hours.
TRAP: FailedAssertion("!(list != ((List *) ((void *)0)))", File: "list.c", Line: 390)
Sample backtraces of a worker and leader below, plan of leader attached.
The collected queries don't seem to reproduce it.

Odd. My understanding of the restrictions on parallel query is that
anything involving a SubPlan ought not be parallelized; and this query
doesn't seem to get parallelized when I try it. It seems like what you're
seeing is that sometimes that restriction fails to be checked, but how
could that happen?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#2)
1 attachment(s)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Fri, May 6, 2016 at 8:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andreas Seltenreich <seltenreich@gmx.de> writes:

when fuzz testing master as of c1543a8, parallel workers trigger the
following assertion in ExecInitSubPlan every couple of hours.
TRAP: FailedAssertion("!(list != ((List *) ((void *)0)))", File: "list.c", Line: 390)

Sample backtraces of a worker and leader below, plan of leader attached.
The collected queries don't seem to reproduce it.

Odd. My understanding of the restrictions on parallel query is that
anything involving a SubPlan ought not be parallelized;

Subplan references are considered parallel-restricted, so a parallel plan
can be generated if there are subplans in a query, but they shouldn't be
pushed to workers. I have tried a somewhat simpler example to see if we
push down anything parallel-restricted to a worker in case of joins, and it
turned out there are cases when that can happen. Consider below example:

create or replace function parallel_func_select() returns integer
as $$
declare
ret_val int;
begin
ret_val := 1000;
return ret_val;
end;
$$ language plpgsql Parallel Restricted;

CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM
generate_series(1, 10000000) g;

CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM
generate_series(1, 1000000) g;

Explain Verbose SELECT t1.c1 + parallel_func_select(), t2.c1 FROM t1 JOIN
t2 ON t1.c1 = t2.c1;

                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Gather  (cost=32813.00..537284.53 rows=1000000 width=8)
   Output: ((t1.c1 + parallel_func_select())), t2.c1
   Workers Planned: 2
   ->  Hash Join  (cost=31813.00..436284.53 rows=1000000 width=8)
         Output: (t1.c1 + parallel_func_select()), t2.c1
         Hash Cond: (t1.c1 = t2.c1)
         ->  Parallel Seq Scan on public.t1  (cost=0.00..95721.08 rows=4166608 width=4)
               Output: t1.c1, t1.c2
         ->  Hash  (cost=15406.00..15406.00 rows=1000000 width=4)
               Output: t2.c1
               ->  Seq Scan on public.t2  (cost=0.00..15406.00 rows=1000000 width=4)
                     Output: t2.c1
(12 rows)

From the above output it is clear that the parallel-restricted function is
pushed down below the Gather node. I found that while we have carefully
avoided pushing a pathtarget below a GatherPath in apply_projection_to_path()
if the pathtarget contains any parallel-unsafe or parallel-restricted clause,
we are separately also trying to apply the pathtarget to the partial path
list, which doesn't seem to be the correct way even if it is required. I
think this was added during the parallel aggregate patch, and it seems to me
it is no longer required after the GatherPath-related changes in
apply_projection_to_path().

The attached patch avoids adding parallel-restricted clauses below a Gather
path.

Now back to the original bug: if you look at the plan file attached to the
original bug report, the subplan is pushed below the Gather node in a target
list, not at the immediate join but one more level down, at the SeqScan
path. I am still not sure how the restricted clauses managed to get pushed
down to that level.

Thoughts?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

avoid_restricted_clause_below_gather_v1.patch (application/octet-stream)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 6770836..c10291b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1748,19 +1748,6 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		}
 
 		/*
-		 * Likewise for any partial paths, although this case is simpler, since
-		 * we don't track the cheapest path.
-		 */
-		foreach(lc, current_rel->partial_pathlist)
-		{
-			Path	   *subpath = (Path *) lfirst(lc);
-
-			Assert(subpath->param_info == NULL);
-			lfirst(lc) = apply_projection_to_path(root, current_rel,
-											subpath, scanjoin_target);
-		}
-
-		/*
 		 * Save the various upper-rel PathTargets we just computed into
 		 * root->upper_targets[].  The core code doesn't use this, but it
 		 * provides a convenient place for extensions to get at the info.  For
#4 Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#3)
1 attachment(s)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Sat, May 7, 2016 at 6:37 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, May 6, 2016 at 8:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andreas Seltenreich <seltenreich@gmx.de> writes:

when fuzz testing master as of c1543a8, parallel workers trigger the
following assertion in ExecInitSubPlan every couple of hours.
TRAP: FailedAssertion("!(list != ((List *) ((void *)0)))", File: "list.c", Line: 390)

Sample backtraces of a worker and leader below, plan of leader attached.

The collected queries don't seem to reproduce it.

Odd. My understanding of the restrictions on parallel query is that
anything involving a SubPlan ought not be parallelized;

Subplan references are considered parallel-restricted, so a parallel plan
can be generated if there are subplans in a query, but they shouldn't be
pushed to workers. I have tried a somewhat simpler example to see if we
push down anything parallel-restricted to a worker in case of joins, and it
turned out there are cases when that can happen. Consider below example:

From the above output it is clear that the parallel-restricted function is
pushed down below the Gather node. I found that while we have carefully
avoided pushing a pathtarget below a GatherPath in apply_projection_to_path()
if the pathtarget contains any parallel-unsafe or parallel-restricted clause,
we are separately also trying to apply the pathtarget to the partial path
list, which doesn't seem to be the correct way even if it is required. I
think this was added during the parallel aggregate patch, and it seems to me
it is no longer required after the GatherPath-related changes in
apply_projection_to_path().

The attached patch avoids adding parallel-restricted clauses below a Gather
path.

Now back to the original bug: if you look at the plan file attached to the
original bug report, the subplan is pushed below the Gather node in a target
list, not at the immediate join but one more level down, at the SeqScan
path. I am still not sure how the restricted clauses managed to get pushed
down to that level.

On further analysis, I think I know what is going on in the original bug
report. We add the Vars (build_base_rel_tlists()) and PlaceHolderVars
(add_placeholders_to_base_rels()) to each relation's (RelOptInfo) target
during query_planner(), and the subplans are added as PlaceHolderVars in the
target expressions. Now, while considering whether a particular rel can be
parallel in set_rel_consider_parallel(), we don't check the target
expressions before allowing the relation for parallelism. I think we can
prohibit the relation from being considered for parallelism if its target
expressions contain any parallel-restricted clause. A fix along those lines
is attached with this mail.

Thanks to Dilip Kumar for helping me in narrowing down this particular
problem. We were not able to generate the exact test, but I think the
above theory is sufficient to prove that it can cause a problem as seen in
the original bug report.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

prohibit_parallel_clause_below_rel_v1.patch (application/octet-stream)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 873a764..585a8db 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -598,14 +598,15 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
 	}
 
 	/*
-	 * If there's anything in baserestrictinfo that's parallel-restricted,
-	 * we give up on parallelizing access to this relation.  We could consider
-	 * instead postponing application of the restricted quals until we're
-	 * above all the parallelism in the plan tree, but it's not clear that
-	 * this would be a win in very many cases, and it might be tricky to make
-	 * outer join clauses work correctly.
+	 * If there's anything in baserestrictinfo or reltarget that's
+	 * parallel-restricted, we give up on parallelizing access to this
+	 * relation.  We could consider instead postponing application of the
+	 * restricted clauses until we're above all the parallelism in the plan
+	 * tree, but it's not clear that this would be a win in very many cases.
+	 * It might be tricky to make outer join clauses work correctly.
 	 */
-	if (has_parallel_hazard((Node *) rel->baserestrictinfo, false))
+	if (has_parallel_hazard((Node *) rel->baserestrictinfo, false) ||
+		has_parallel_hazard((Node *) rel->reltarget->exprs, false))
 		return;
 
 	/* We have a winner. */
#5 Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#4)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, May 12, 2016 at 7:46 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On further analysis, I think I know what is going on in the original bug
report. We add the Vars (build_base_rel_tlists()) and PlaceHolderVars
(add_placeholders_to_base_rels()) to each relation's (RelOptInfo) target
during query_planner(), and the subplans are added as PlaceHolderVars in the
target expressions. Now, while considering whether a particular rel can be
parallel in set_rel_consider_parallel(), we don't check the target
expressions before allowing the relation for parallelism. I think we can
prohibit the relation from being considered for parallelism if its target
expressions contain any parallel-restricted clause. A fix along those lines
is attached with this mail.

Thanks to Dilip Kumar for helping me in narrowing down this particular
problem. We were not able to generate the exact test, but I think the above
theory is sufficient to prove that it can cause a problem as seen in the
original bug report.

I could be wrong, but I thought that the target list for an expression
would always contain only Vars at this stage. Non-default tlists get
injected at the end of scan/join planning. Am I wrong?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#5)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

Robert Haas <robertmhaas@gmail.com> writes:

I could be wrong, but I thought that the target list for an expression
would always contain only Vars at this stage. Non-default tlists get
injected at the end of scan/join planning. Am I wrong?

Target list for a relation, you mean? See relation.h:

* reltarget - Default Path output tlist for this rel; normally contains
* Var and PlaceHolderVar nodes for the values we need to
* output from this relation.
* List is in no particular order, but all rels of an
* appendrel set must use corresponding orders.
* NOTE: in an appendrel child relation, may contain
* arbitrary expressions pulled up from a subquery!

regards, tom lane


#7 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, May 12, 2016 at 11:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I could be wrong, but I thought that the target list for an expression
would always contain only Vars at this stage. Non-default tlists get
injected at the end of scan/join planning. Am I wrong?

Target list for a relation, you mean? See relation.h:

* reltarget - Default Path output tlist for this rel; normally contains
* Var and PlaceHolderVar nodes for the values we need to
* output from this relation.
* List is in no particular order, but all rels of an
* appendrel set must use corresponding orders.
* NOTE: in an appendrel child relation, may contain
* arbitrary expressions pulled up from a subquery!

Err, wow. That makes my head hurt. Can you explain why this case
only arises for appendrel children, and not for plain rels?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#7)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

Robert Haas <robertmhaas@gmail.com> writes:

Target list for a relation, you mean? See relation.h:

* reltarget - Default Path output tlist for this rel; normally contains
* Var and PlaceHolderVar nodes for the values we need to
* output from this relation.
* List is in no particular order, but all rels of an
* appendrel set must use corresponding orders.
* NOTE: in an appendrel child relation, may contain
* arbitrary expressions pulled up from a subquery!

Err, wow. That makes my head hurt. Can you explain why this case
only arises for appendrel children, and not for plain rels?

Well, plain rels only output Vars ;-)

But consider an appendrel representing

(SELECT x+1 FROM t1 UNION ALL SELECT y+2 FROM t2) ss(a)

The RTE for ss will have a reltarget list containing just "a".
Once we pull up the subqueries, the reltarget lists for the two child
appendrel members will need to contain "x+1" and "y+2" in order to be
equivalent to the parent's reltarget list. See set_append_rel_size(),
which does that transformation.

This doesn't happen with ordinary subquery flattening because there
isn't a RelOptInfo corresponding to an ordinary subquery that's been
pulled up into the parent query.

regards, tom lane


#9 Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#8)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, May 12, 2016 at 11:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Target list for a relation, you mean? See relation.h:

* reltarget - Default Path output tlist for this rel; normally contains
* Var and PlaceHolderVar nodes for the values we need to
* output from this relation.
* List is in no particular order, but all rels of an
* appendrel set must use corresponding orders.
* NOTE: in an appendrel child relation, may contain
* arbitrary expressions pulled up from a subquery!

Err, wow. That makes my head hurt. Can you explain why this case
only arises for appendrel children, and not for plain rels?

Well, plain rels only output Vars ;-)

Does this mean that base rels can't contain PlaceHolderVars? Isn't it
possible in below code:

query_planner()
{
..
/*
* Now distribute "placeholders" to base rels as needed. This has to be
* done after join removal because removal could change whether a
* placeholder is evaluatable at a base rel.
*/
add_placeholders_to_base_rels(root);
..
}

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#10 Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#9)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Fri, May 13, 2016 at 9:43 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Thu, May 12, 2016 at 11:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Target list for a relation, you mean? See relation.h:

* reltarget - Default Path output tlist for this rel; normally contains
* Var and PlaceHolderVar nodes for the values we need to
* output from this relation.
* List is in no particular order, but all rels of an
* appendrel set must use corresponding orders.
* NOTE: in an appendrel child relation, may contain
* arbitrary expressions pulled up from a subquery!

Err, wow. That makes my head hurt. Can you explain why this case
only arises for appendrel children, and not for plain rels?

Well, plain rels only output Vars ;-)

Does this mean that base rels can't contain PlaceHolderVars? Isn't it
possible in the below code:

Here I mean to ask about base rels which are plain rels.

It might be that I am missing something, but if we debug the serial plan
for the original query [1] for which this issue is reported, we have noticed
that PlaceHolderVars that contain subplans are added to base rels whose
RTE kind is RTE_RELATION.

[1] select ref_68.is_trigger_deletable as c0, (select d from
public.renamecolumnchild limit 1 offset 15) as c1,
ref_68.is_trigger_insertable_into as c2, 3 as c3 from
information_schema.role_udt_grants as ref_67 left join
information_schema.views as ref_68 on (ref_67.grantee =
ref_68.table_catalog ) where true limit 138;

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#11 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#10)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Fri, May 13, 2016 at 10:31 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

Here I mean to ask about base rels which are plain rels.

It might be that I am missing something, but if we debug the serial plan
for original query [1] for which this issue is reported, we have noticed
that PlaceHolderVars that contain subplans are added to base rels for which
RTE kind is (RTE_RELATION).

[1] - select ref_68.is_trigger_deletable as c0, (select d from

Finally, I could reproduce this issue with just three tables.

If the target list of the top query has subquery references and the
subquery is part of an OUTER join, then pulling up the subquery creates a
PlaceHolderVar, which will contain expressions.

After applying prohibit_parallel_clause_below_rel_v1.patch, it no longer
selects a parallel plan. So now it's clear that, because of subquery
pullup, we may get an expression in the target list while creating the
single-table path list, and we need to avoid a parallel plan if it contains
such an expression.

Below is my test:
--------------------------

postgres=# CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM
postgres-# generate_series(1, 10000000) g;

SELECT 10000000
postgres=#
postgres=# CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM
postgres-# generate_series(1, 1000000) g;
SELECT 1000000
postgres=# CREATE TABLE t3(c1, c2) AS SELECT g, repeat('x', 5) FROM
generate_series(1, 1000000) g;
SELECT 1000000
postgres=# analyze t1;
ANALYZE
postgres=# analyze t2;
ANALYZE
postgres=# analyze t3;
ANALYZE

postgres=# explain select t1.c1, y.x from t1 left join (select (select
t3.c1 from t3 where t3.c1=t2.c1) as x from t2)y on (y.x=t1.c1);
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Hash Right Join  (cost=319113.85..96991333895.85 rows=9999860 width=8)
   Hash Cond: ((SubPlan 2) = t1.c1)
   ->  Gather  (cost=1000.00..7460943906.00 rows=1000000 width=8)
         Workers Planned: 2
         ->  Parallel Seq Scan on t2  (cost=0.00..7460842906.00 rows=416667 width=8)
               SubPlan 1
                 ->  Seq Scan on t3  (cost=0.00..17906.00 rows=1 width=4)
                       Filter: (c1 = t2.c1)
   ->  Hash  (cost=154053.60..154053.60 rows=9999860 width=4)
         ->  Seq Scan on t1  (cost=0.00..154053.60 rows=9999860 width=4)
   SubPlan 2
     ->  Seq Scan on t3 t3_1  (cost=0.00..17906.00 rows=1 width=4)
           Filter: (c1 = t2.c1)
(13 rows)

postgres=# select t1.c1, y.x from t1 left join (select (select t3.c1 from
t3 where t3.c1=t2.c1) as x from t2)y on (y.x=t1.c1);
LOG:  worker process: parallel worker for PID 109446 (PID 109483) was terminated by signal 11: Segmentation fault
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#12 Andreas Seltenreich
seltenreich@gmx.de
In reply to: Dilip Kumar (#11)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

Amit Kapila writes:

avoid_restricted_clause_below_gather_v1.patch
prohibit_parallel_clause_below_rel_v1.patch

I didn't observe any parallel-worker-related coredumps since applying
these. The same amount of testing done before applying them yielded
about a dozen.

Dilip Kumar writes:

So now it's clear that, because of subquery pullup, we may get an
expression in the target list while creating the single-table path list,
and we need to avoid a parallel plan if it contains such an expression.

This sounds like a rather heavy restriction though…

regards,
Andreas


#13 Amit Kapila
amit.kapila16@gmail.com
In reply to: Andreas Seltenreich (#12)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Sun, May 22, 2016 at 9:32 PM, Andreas Seltenreich <seltenreich@gmx.de>
wrote:

Amit Kapila writes:

avoid_restricted_clause_below_gather_v1.patch
prohibit_parallel_clause_below_rel_v1.patch

I didn't observe any parallel-worker-related coredumps since applying
these. The same amount of testing done before applying them yielded
about a dozen.

Thanks for verification.

Dilip Kumar writes:

So now it's clear that, because of subquery pullup, we may get an
expression in the target list while creating the single-table path list,
and we need to avoid a parallel plan if it contains such an expression.

This sounds like a rather heavy restriction though…

I think what Dilip means by the above statement is to avoid a parallel plan
if the target list contains parallel-unsafe or parallel-restricted
expressions. We already restrict generation of parallel plans if the
qualification for a relation contains such expressions (see
set_rel_consider_parallel()), so this doesn't sound like a heavy
restriction.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#14 Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#8)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, May 12, 2016 at 11:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Target list for a relation, you mean? See relation.h:

* reltarget - Default Path output tlist for this rel; normally contains
*             Var and PlaceHolderVar nodes for the values we need to
*             output from this relation.
*             List is in no particular order, but all rels of an
*             appendrel set must use corresponding orders.
*             NOTE: in an appendrel child relation, may contain
*             arbitrary expressions pulled up from a subquery!

Err, wow. That makes my head hurt. Can you explain why this case
only arises for appendrel children, and not for plain rels?

Well, plain rels only output Vars ;-)

But consider an appendrel representing

(SELECT x+1 FROM t1 UNION ALL SELECT y+2 FROM t2) ss(a)

The RTE for ss will have a reltarget list containing just "a".
Once we pull up the subqueries, the reltarget lists for the two child
appendrel members will need to contain "x+1" and "y+2" in order to be
equivalent to the parent's reltarget list. See set_append_rel_size(),
which does that transformation.
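Tom's example can be made concrete with a short sketch (the table
definitions here are hypothetical, not taken from the thread):

```sql
-- Hypothetical tables for illustration.
CREATE TABLE t1 (x int);
CREATE TABLE t2 (y int);

-- Before pull-up, the appendrel parent's reltarget contains just "a".
-- After pull_up_subqueries(), the two child reltargets must contain the
-- expressions "x + 1" and "y + 2" rather than bare Vars.
SELECT a
FROM (SELECT x + 1 FROM t1
      UNION ALL
      SELECT y + 2 FROM t2) ss(a);
```

If either expression were parallel restricted (say, a call to a
parallel-restricted function instead of "+"), the child rel's target list
would be unsafe to evaluate below a Gather node.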

I think this can also lead to an issue, as currently for child relations we
just copy the consider_parallel flag of the parent. Now, if the child rel's
target list is adjusted such that it contains a parallel-restricted
expression, then we are bound to fail. I think to handle the append-rel
case appropriately, we should compute the consider_parallel flag for each
child relation in set_append_rel_size() and then later, when computing the
flag for the parent rel, consider the flags of the child rels (basically,
if any child rel has consider_parallel set to false, then set
consider_parallel for the parent to false as well). To achieve this, we
need to call set_rel_consider_parallel() after calling set_rel_size().

Thoughts?

Just to summarize, apart from above issue, we have discussed two different
issues related to parallel query in this thread.
a. Push down of parallel-restricted clauses to nodes below Gather. A patch
to fix this is posted upthread [1].
b. Wrong assumption that the target list can only contain Vars. A patch to
fix this is posted upthread [2]. A test which proves the assumption wrong
is also posted upthread [3].

I will add this issue in list of open issues for 9.6 @
https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items

[1] /messages/by-id/CAA4eK1Ky2=HsTsT4hmfL=EAL5rv0_t59tvWzVK9HQKvN6Dovkw@mail.gmail.com
[2] /messages/by-id/CAA4eK1L-Uo=s4=0jvvVA51pj06u5WdDvSQg673yuxJ_Ja+x86Q@mail.gmail.com
[3] /messages/by-id/CAFiTN-vzg5BkK6kAh3OMhvgRu-uJvkjz47ybtopMAfGJp=zWqA@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#15Noah Misch
noah@leadboat.com
In reply to: Amit Kapila (#14)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, May 26, 2016 at 05:08:31PM +0530, Amit Kapila wrote:

Just to summarize, apart from above issue, we have discussed two different
issues related to parallel query in this thread.
a. Push down of parallel-restricted clauses to nodes below Gather. A patch
to fix this is posted upthread [1].
b. Wrong assumption that the target list can only contain Vars. A patch to
fix this is posted upthread [2]. A test which proves the assumption wrong
is also posted upthread [3].

I will add this issue in list of open issues for 9.6 @
https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items

[1] -
/messages/by-id/CAA4eK1Ky2=HsTsT4hmfL=EAL5rv0_t59tvWzVK9HQKvN6Dovkw@mail.gmail.com
[2] -
/messages/by-id/CAA4eK1L-Uo=s4=0jvvVA51pj06u5WdDvSQg673yuxJ_Ja+x86Q@mail.gmail.com
[3] -
/messages/by-id/CAFiTN-vzg5BkK6kAh3OMhvgRu-uJvkjz47ybtopMAfGJp=zWqA@mail.gmail.com

[Action required within 72 hours. This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
9.6 open item, please let us know. Otherwise, please observe the policy on
open item ownership [1] and send a status update within 72 hours of this
message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping 9.6rc1. Consequently, I will appreciate your
efforts toward speedy resolution. Thanks.

[1]: /messages/by-id/20160527025039.GA447393@tornado.leadboat.com


#16Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#3)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Sat, May 7, 2016 at 9:07 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

From the above output it is clear that a parallel-restricted function is
pushed down below the Gather node. I found that although we have carefully
avoided pushing the pathtarget below GatherPath in
apply_projection_to_path() if the pathtarget contains any parallel-unsafe
or parallel-restricted clause, we are separately also trying to apply the
pathtarget to the partial path list, which doesn't seem to be correct even
if it were required. I think this was added during the parallel aggregate
patch, and it seems to me it is not required after the GatherPath-related
changes in apply_projection_to_path().

Good catch. Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#17Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#8)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, May 12, 2016 at 2:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Err, wow. That makes my head hurt. Can you explain why this case
only arises for appendrel children, and not for plain rels?

Well, plain rels only output Vars ;-)

Hmm. Dilip's example in
/messages/by-id/CAFiTN-vzg5BkK6kAh3OMhvgRu-uJvkjz47ybtopMAfGJp=zWqA@mail.gmail.com
seems to show that it's possible to end up with a targetlist
containing non-Vars even for a baserel. In that example,
pull_up_subqueries() injects t2 into the parent query with a target
list containing two items, one of which is a PlaceHolderVar
referencing the subplan for t3.
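The shape of query at issue can be sketched under a hypothetical schema
(the actual test case is in the linked message):

```sql
-- Hypothetical schema for illustration only.
CREATE TABLE t1 (c1 int);
CREATE TABLE t2 (c1 int, c2 int);
CREATE TABLE t3 (c2 int);

-- After the subquery "sub" is pulled up, the scalar subquery over t3 is a
-- non-Var output of base rel t2; because it must become NULL when the left
-- join fails, it is wrapped in a PlaceHolderVar that lands in t2's
-- reltarget.
SELECT t1.c1, sub.s
FROM t1
LEFT JOIN (SELECT c1, (SELECT sum(c2) FROM t3) AS s
           FROM t2) sub
       ON t1.c1 = sub.c1;
```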

Unfortunately, that makes it a bit hard to avoid wasting cycles here.
Amit's patch in
/messages/by-id/CAA4eK1L-Uo=s4=0jvvVA51pj06u5WdDvSQg673yuxJ_Ja+x86Q@mail.gmail.com
is a bit brute force: it just checks the target list for every
relation for parallel-restricted constructs. In most cases, no
subqueries having been pulled up, it will find none, but it'll still
have to walk the whole target list to figure that out. If we had some
way of knowing that nothing was pulled up into the current rel, we
could avoid that. This seems to apply equally to baserels and
appendrels; both will commonly have a target list containing only
Vars, but both can end up with non-Vars in the target list after
subqueries are pulled up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#18Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#15)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Wed, Jun 1, 2016 at 9:35 PM, Noah Misch <noah@leadboat.com> wrote:

[Action required within 72 hours. This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
9.6 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within 72 hours of this
message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping 9.6rc1. Consequently, I will appreciate your
efforts toward speedy resolution. Thanks.

Status update:

I'm working on it. I think there are several issues here, of which
I've committed a fix for just one. I hope to have them all fixed by
next Friday. If not, I'll send another update then.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#19Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#17)
1 attachment(s)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Sat, Jun 4, 2016 at 1:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, May 12, 2016 at 2:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Err, wow. That makes my head hurt. Can you explain why this case
only arises for appendrel children, and not for plain rels?

Well, plain rels only output Vars ;-)

Hmm. Dilip's example in

/messages/by-id/CAFiTN-vzg5BkK6kAh3OMhvgRu-uJvkjz47ybtopMAfGJp=zWqA@mail.gmail.com

seems to show that it's possible to end up with a targetlist
containing non-Vars even for a baserel. In that example,
pull_up_subqueries() injects t2 into the parent query with a target
list containing two items, one of which is a PlaceHolderVar
referencing the subplan for t3.

Unfortunately, that makes it a bit hard to avoid wasting cycles here.
Amit's patch in

/messages/by-id/CAA4eK1L-Uo=s4=0jvvVA51pj06u5WdDvSQg673yuxJ_Ja+x86Q@mail.gmail.com

is a bit brute force: it just checks the target list for every
relation for parallel-restricted constructs. In most cases, no
subqueries having been pulled up, it will find none, but it'll still
have to walk the whole target list to figure that out. If we had some
way of knowing that nothing was pulled up into the current rel, we
could avoid that. This seems to apply equally to baserels and
appendrels;

That seems doable, as for such rels we can only have Vars and
PlaceHolderVars in the targetlist. Basically, whenever we add
PlaceHolderVars to a relation, we just remember that information and use it
later. A patch doing the same is attached to this mail. This still won't
cover the case of child rels of an Append relation, as for those we adjust
the target list separately in set_append_rel_size(); I think we need to
deal with that separately.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

prohibit_parallel_clause_below_rel_v2.patch (application/octet-stream)
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c2f0e0f..94d114a 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2083,6 +2083,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_BOOL_FIELD(consider_param_startup);
 	WRITE_BOOL_FIELD(consider_parallel);
 	WRITE_NODE_FIELD(reltarget);
+	WRITE_BOOL_FIELD(has_target_non_vars);
 	WRITE_NODE_FIELD(pathlist);
 	WRITE_NODE_FIELD(ppilist);
 	WRITE_NODE_FIELD(partial_pathlist);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 873a764..e185d47 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -598,14 +598,16 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
 	}
 
 	/*
-	 * If there's anything in baserestrictinfo that's parallel-restricted,
-	 * we give up on parallelizing access to this relation.  We could consider
-	 * instead postponing application of the restricted quals until we're
-	 * above all the parallelism in the plan tree, but it's not clear that
-	 * this would be a win in very many cases, and it might be tricky to make
-	 * outer join clauses work correctly.
+	 * If there's anything in baserestrictinfo or reltarget that's
+	 * parallel-restricted, we give up on parallelizing access to this
+	 * relation.  We could consider instead postponing application of the
+	 * restricted clauses until we're above all the parallelism in the plan
+	 * tree, but it's not clear that this would be a win in very many cases.
+	 * It might be tricky to make outer join clauses work correctly.
 	 */
-	if (has_parallel_hazard((Node *) rel->baserestrictinfo, false))
+	if (has_parallel_hazard((Node *) rel->baserestrictinfo, false) ||
+		(rel->has_target_non_vars &&
+		 has_parallel_hazard((Node *) rel->reltarget->exprs, false)))
 		return;
 
 	/* We have a winner. */
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
index b210914..0423ae0 100644
--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -393,6 +393,7 @@ add_placeholders_to_base_rels(PlannerInfo *root)
 
 			rel->reltarget->exprs = lappend(rel->reltarget->exprs,
 											copyObject(phinfo->ph_var));
+			rel->has_target_non_vars = true;
 			/* reltarget's cost and width fields will be updated later */
 		}
 	}
@@ -427,6 +428,7 @@ add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
 				/* Yup, add it to the output */
 				joinrel->reltarget->exprs = lappend(joinrel->reltarget->exprs,
 													phinfo->ph_var);
+				joinrel->has_target_non_vars = true;
 				joinrel->reltarget->width += phinfo->ph_width;
 
 				/*
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 1e87a73..21a61b1 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -109,6 +109,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->consider_parallel = false;		/* might get changed later */
 	rel->rel_parallel_degree = -1; /* set up in GetRelationInfo */
 	rel->reltarget = create_empty_pathtarget();
+	rel->has_target_non_vars = false;
 	rel->pathlist = NIL;
 	rel->ppilist = NIL;
 	rel->partial_pathlist = NIL;
@@ -396,6 +397,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->consider_param_startup = false;
 	joinrel->consider_parallel = false;
 	joinrel->reltarget = create_empty_pathtarget();
+	joinrel->has_target_non_vars = false;
 	joinrel->pathlist = NIL;
 	joinrel->ppilist = NIL;
 	joinrel->partial_pathlist = NIL;
@@ -506,8 +508,8 @@ build_join_rel(PlannerInfo *root,
 	 * Set the consider_parallel flag if this joinrel could potentially be
 	 * scanned within a parallel worker.  If this flag is false for either
 	 * inner_rel or outer_rel, then it must be false for the joinrel also.
-	 * Even if both are true, there might be parallel-restricted quals at our
-	 * level.
+	 * Even if both are true, there might be parallel-restricted expression in
+	 * targetlist or quals at our level.
 	 *
 	 * Note that if there are more than two rels in this relation, they could
 	 * be divided between inner_rel and outer_rel in any arbitrary way.  We
@@ -517,7 +519,9 @@ build_join_rel(PlannerInfo *root,
 	 * here.
 	 */
 	if (inner_rel->consider_parallel && outer_rel->consider_parallel &&
-		!has_parallel_hazard((Node *) restrictlist, false))
+		!has_parallel_hazard((Node *) restrictlist, false) &&
+		!(joinrel->has_target_non_vars &&
+		  has_parallel_hazard((Node *) joinrel->reltarget->exprs, false)))
 		joinrel->consider_parallel = true;
 
 	/*
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 45739c3..f16d932 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -490,6 +490,8 @@ typedef struct RelOptInfo
 
 	/* default result targetlist for Paths scanning this relation */
 	struct PathTarget *reltarget;		/* list of Vars/Exprs, cost, width */
+	bool		has_target_non_vars;	/* true if any expression in
+										 * PathTarget is NonVar */
 
 	/* materialization information */
 	List	   *pathlist;		/* Path structures */
#20Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#19)
1 attachment(s)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Mon, Jun 6, 2016 at 6:07 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

That seems doable, as for such rels we can only have Vars and
PlaceHolderVars in targetlist. Basically, whenever we are adding
PlaceHolderVars to a relation, just remember that information and use it
later. The patch doing the same is attached with this mail. Now still,
this won't cover the case of ChildRels for an Append Relation as for that we
adjust target list separately in set_append_rel_size. I think we need to
deal with it separately.

This approach looks pretty good to me. Here's a revised version of
your patch, with some renaming and other adjustments. I'm not sure
exactly what you're referring to in set_append_rel_size, but I did add
a line of code there that might be relevant.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

check-tlists-rmh.patch (application/octet-stream)
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 389808c..06e6f00 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2083,6 +2083,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_BOOL_FIELD(consider_param_startup);
 	WRITE_BOOL_FIELD(consider_parallel);
 	WRITE_NODE_FIELD(reltarget);
+	WRITE_BOOL_FIELD(reltarget_has_non_vars);
 	WRITE_NODE_FIELD(pathlist);
 	WRITE_NODE_FIELD(ppilist);
 	WRITE_NODE_FIELD(partial_pathlist);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 873a764..5d007fa 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -608,6 +608,15 @@ set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
 	if (has_parallel_hazard((Node *) rel->baserestrictinfo, false))
 		return;
 
+	/*
+	 * If the relation's outputs are not parallel-safe, we must give up.
+	 * In the common case where the relation only outputs Vars, this check is
+	 * very cheap; otherwise, we have to do more work.
+	 */
+	if (rel->reltarget_has_non_vars &&
+		has_parallel_hazard((Node *) rel->reltarget->exprs, false))
+		return;
+
 	/* We have a winner. */
 	rel->consider_parallel = true;
 }
@@ -975,6 +984,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			adjust_appendrel_attrs(root,
 								   (Node *) rel->reltarget->exprs,
 								   appinfo);
+		childrel->reltarget_has_non_vars = rel->reltarget_has_non_vars;
 
 		/*
 		 * We have to make child entries in the EquivalenceClass data
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
index b210914..5b85a4d 100644
--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -393,6 +393,7 @@ add_placeholders_to_base_rels(PlannerInfo *root)
 
 			rel->reltarget->exprs = lappend(rel->reltarget->exprs,
 											copyObject(phinfo->ph_var));
+			rel->reltarget_has_non_vars = true;
 			/* reltarget's cost and width fields will be updated later */
 		}
 	}
@@ -427,6 +428,7 @@ add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
 				/* Yup, add it to the output */
 				joinrel->reltarget->exprs = lappend(joinrel->reltarget->exprs,
 													phinfo->ph_var);
+				joinrel->reltarget_has_non_vars = true;
 				joinrel->reltarget->width += phinfo->ph_width;
 
 				/*
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 1e87a73..b278299 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -109,6 +109,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->consider_parallel = false;		/* might get changed later */
 	rel->rel_parallel_degree = -1; /* set up in GetRelationInfo */
 	rel->reltarget = create_empty_pathtarget();
+	rel->reltarget_has_non_vars = false;
 	rel->pathlist = NIL;
 	rel->ppilist = NIL;
 	rel->partial_pathlist = NIL;
@@ -396,6 +397,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->consider_param_startup = false;
 	joinrel->consider_parallel = false;
 	joinrel->reltarget = create_empty_pathtarget();
+	joinrel->reltarget_has_non_vars = false;
 	joinrel->pathlist = NIL;
 	joinrel->ppilist = NIL;
 	joinrel->partial_pathlist = NIL;
@@ -506,8 +508,8 @@ build_join_rel(PlannerInfo *root,
 	 * Set the consider_parallel flag if this joinrel could potentially be
 	 * scanned within a parallel worker.  If this flag is false for either
 	 * inner_rel or outer_rel, then it must be false for the joinrel also.
-	 * Even if both are true, there might be parallel-restricted quals at our
-	 * level.
+	 * Even if both are true, there might be parallel-restricted expressions
+	 * in the targetlist or quals.
 	 *
 	 * Note that if there are more than two rels in this relation, they could
 	 * be divided between inner_rel and outer_rel in any arbitrary way.  We
@@ -517,7 +519,9 @@ build_join_rel(PlannerInfo *root,
 	 * here.
 	 */
 	if (inner_rel->consider_parallel && outer_rel->consider_parallel &&
-		!has_parallel_hazard((Node *) restrictlist, false))
+		!has_parallel_hazard((Node *) restrictlist, false) &&
+		!(joinrel->reltarget_has_non_vars &&
+		  has_parallel_hazard((Node *) joinrel->reltarget->exprs, false)))
 		joinrel->consider_parallel = true;
 
 	/*
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index b1f6cf4..d392e0b 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -490,6 +490,8 @@ typedef struct RelOptInfo
 
 	/* default result targetlist for Paths scanning this relation */
 	struct PathTarget *reltarget;		/* list of Vars/Exprs, cost, width */
+	bool		reltarget_has_non_vars;	/* true if any expression in
+										 * PathTarget is a non-Var */
 
 	/* materialization information */
 	List	   *pathlist;		/* Path structures */
#21Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#20)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, Jun 9, 2016 at 8:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jun 6, 2016 at 6:07 AM, Amit Kapila <amit.kapila16@gmail.com>

wrote:

That seems doable, as for such rels we can only have Vars and
PlaceHolderVars in targetlist. Basically, whenever we are adding
PlaceHolderVars to a relation, just remember that information and use it
later. The patch doing the same is attached with this mail. Now still,
this won't cover the case of ChildRels for an Append Relation as for that
we adjust target list separately in set_append_rel_size. I think we need
to deal with it separately.

This approach looks pretty good to me. Here's a revised version of
your patch, with some renaming and other adjustments.

Your version looks good to me.

I'm not sure
exactly what you're referring to in set_append_rel_size,

The below code:

set_append_rel_size()
{
..

	/*
	 * CE failed, so finish copying/modifying targetlist and join quals.
	 *
	 * NB: the resulting childrel->reltarget->exprs may contain arbitrary
	 * expressions, which otherwise would not occur in a rel's targetlist.
	..
	 */

	childrel->reltarget->exprs = (List *)
		adjust_appendrel_attrs(root,
							   (Node *) rel->reltarget->exprs,
							   appinfo);

What I mean to say is that if the above code leads to additional
expressions in the childrel targetlist, then those can introduce unsafe
expressions into the child rel's target list. This can cause a problem, as
we blindly assume that if the parent rel is parallel safe, the child rel
will also be parallel safe (in the code below:)

set_append_rel_size()
{
..
/* Copy consider_parallel flag from parent. */
childrel->consider_parallel = rel->consider_parallel;

but I did add
a line of code there that might be relevant.

I am not sure if that can address the above problem. I have posted my
analysis on the same in mail [1].

[1] /messages/by-id/CAA4eK1Jz5tG2D2PCNYqYob2cb4dKowKYwVJ7OmP5vwyRyCMx4g@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#22Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#21)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, Jun 9, 2016 at 9:37 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 9, 2016 at 8:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jun 6, 2016 at 6:07 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

That seems doable, as for such rels we can only have Vars and
PlaceHolderVars in targetlist. Basically, whenever we are adding
PlaceHolderVars to a relation, just remember that information and use it
later. The patch doing the same is attached with this mail. Now still,
this won't cover the case of ChildRels for an Append Relation as for
that we
adjust target list separately in set_append_rel_size. I think we need
to
deal with it separately.

This approach looks pretty good to me. Here's a revised version of
your patch, with some renaming and other adjustments.

Your version looks good to me.

I'm not sure
exactly what you're referring to in set_append_rel_size,

The below code:

set_append_rel_size()
{
..

/*

* CE failed, so finish copying/modifying targetlist and join quals.

*
* NB: the resulting childrel->reltarget->exprs may contain arbitrary
* expressions, which otherwise would not occur in a rel's targetlist.
..
*/

childrel->reltarget->exprs = (List *)
adjust_appendrel_attrs(root,
(Node *) rel->reltarget->exprs,
appinfo);

What I mean to say is if above code lead to some additional expressions in
childrel targetlist, then those can lead to some unsafe expressions in child
rel target list. This can cause some problem as we are blindly considering
that if parent rel is parallel safe, then child rel will also be parallel
safe (in below code:)

set_append_rel_size()
{
..
/* Copy consider_parallel flag from parent. */
childrel->consider_parallel = rel->consider_parallel;

but I did add
a line of code there that might be relevant.

I am not sure if that can address the above problem. I have posted my
analysis on the same in mail [1].

[1] -
/messages/by-id/CAA4eK1Jz5tG2D2PCNYqYob2cb4dKowKYwVJ7OmP5vwyRyCMx4g@mail.gmail.com

Do you have a test case that demonstrates the problem with this patch
applied? I agree there may be a problem, but it would be easier to be
sure whether a given change fixes it if we had a good test case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#23Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#22)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, Jun 9, 2016 at 7:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jun 9, 2016 at 9:37 AM, Amit Kapila <amit.kapila16@gmail.com>

wrote:

On Thu, Jun 9, 2016 at 8:47 AM, Robert Haas <robertmhaas@gmail.com>

wrote:

On Mon, Jun 6, 2016 at 6:07 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

That seems doable, as for such rels we can only have Vars and
PlaceHolderVars in targetlist. Basically, whenever we are adding
PlaceHolderVars to a relation, just remember that information and use it
later. The patch doing the same is attached with this mail. Now still,
this won't cover the case of ChildRels for an Append Relation as for that
we adjust target list separately in set_append_rel_size. I think we need
to deal with it separately.

This approach looks pretty good to me. Here's a revised version of
your patch, with some renaming and other adjustments.

Your version looks good to me.

I'm not sure
exactly what you're referring to in set_append_rel_size,

The below code:

set_append_rel_size()
{
..

	/*
	 * CE failed, so finish copying/modifying targetlist and join quals.
	 *
	 * NB: the resulting childrel->reltarget->exprs may contain arbitrary
	 * expressions, which otherwise would not occur in a rel's targetlist.
	..
	 */

	childrel->reltarget->exprs = (List *)
		adjust_appendrel_attrs(root,
							   (Node *) rel->reltarget->exprs,
							   appinfo);

What I mean to say is that if the above code leads to additional
expressions in the childrel targetlist, then those can introduce unsafe
expressions into the child rel's target list. This can cause a problem, as
we blindly assume that if the parent rel is parallel safe, the child rel
will also be parallel safe (in the code below:)

set_append_rel_size()
{
..
/* Copy consider_parallel flag from parent. */
childrel->consider_parallel = rel->consider_parallel;

but I did add
a line of code there that might be relevant.

I am not sure if that can address the above problem. I have posted my
analysis on the same in mail [1].

[1] -

/messages/by-id/CAA4eK1Jz5tG2D2PCNYqYob2cb4dKowKYwVJ7OmP5vwyRyCMx4g@mail.gmail.com

Do you have a test case that demonstrates the problem with this patch
applied?

No, I don't have a test case. I have tried a bit, but it seems even if
there is a problem, one needs to spend quite some time to generate an exact
test. I think the current patch fixes the reproducible problem [1], so
let's go with it. We can look into the Append case problem if someone has
a reproducible case or if I can come up with one. In any case, I think it
is a separate problem from what is reported in this thread.

[1] /messages/by-id/CAFiTN-vzg5BkK6kAh3OMhvgRu-uJvkjz47ybtopMAfGJp=zWqA@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#24Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#23)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

On Thu, Jun 9, 2016 at 10:44 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

No, I don't have a test case. I have tried a bit, but it seems even if
there is a problem, one needs to spend quite some time to generate an exact
test. I think the current patch fixes the reproducible problem [1], so
let's go with it. We can look into the Append case problem if someone has
a reproducible case or if I can come up with one. In any case, I think it
is a separate problem from what is reported in this thread.

OK, committed, then. I have an idea about what that other problem
might be that I'll investigate separately. For now, I'm marking this
open item as closed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
