Query running for very long time (server hanged) with parallel append

Started by Rajkumar Raghuwanshi, almost 8 years ago. 17 messages.
#1Rajkumar Raghuwanshi
rajkumar.raghuwanshi@enterprisedb.com

Hi,

I am getting a server-hang kind of issue with the settings below.
The issue may occur while running the query below a single time or
multiple times (it is random), and I do not get the terminal back even
after cancelling the query. The query and its EXPLAIN output are given below.

SET enable_hashjoin TO off;
SET enable_nestloop TO off;
SET enable_seqscan TO off;
SET parallel_setup_cost=0;
SET parallel_tuple_cost=0;
SET max_parallel_workers_per_gather=4;
SET enable_parallel_append = on;
SET enable_partition_wise_join TO true;

CREATE TABLE plt1 (a int, b int, c text) PARTITION BY LIST(c);
CREATE TABLE plt1_p1 PARTITION OF plt1 FOR VALUES IN ('0000', '0003', '0004', '0010');
CREATE TABLE plt1_p2 PARTITION OF plt1 FOR VALUES IN ('0001', '0005', '0002', '0009');
CREATE TABLE plt1_p3 PARTITION OF plt1 FOR VALUES IN ('0006', '0007', '0008', '0011');
INSERT INTO plt1 SELECT i, i, to_char(i%12, 'FM0000') FROM generate_series(0, 599, 2) i;

CREATE TABLE plt2 (a int, b int, c text) PARTITION BY LIST(c);
CREATE TABLE plt2_p1 PARTITION OF plt2 FOR VALUES IN ('0000', '0003', '0004', '0010');
CREATE TABLE plt2_p2 PARTITION OF plt2 FOR VALUES IN ('0001', '0005', '0002', '0009');
CREATE TABLE plt2_p3 PARTITION OF plt2 FOR VALUES IN ('0006', '0007', '0008', '0011');
INSERT INTO plt2 SELECT i, i, to_char(i%12, 'FM0000') FROM generate_series(0, 599, 3) i;

CREATE INDEX iplt1_p1_c on plt1_p1(c);
CREATE INDEX iplt1_p2_c on plt1_p2(c);
CREATE INDEX iplt1_p3_c on plt1_p3(c);
CREATE INDEX iplt2_p1_c on plt2_p1(c);
CREATE INDEX iplt2_p2_c on plt2_p2(c);
CREATE INDEX iplt2_p3_c on plt2_p3(c);

ANALYZE plt1;
ANALYZE plt2;

EXPLAIN (COSTS OFF)
SELECT DISTINCT t1.c, count(*)
FROM plt1 t1
  LEFT JOIN LATERAL
    (SELECT t2.c AS t2c, t3.c AS t3c, least(t1.c, t2.c, t3.c)
     FROM plt1 t2 JOIN plt2 t3 ON (t2.c = t3.c)) ss
  ON t1.c = ss.t2c
WHERE t1.a % 25 = 0
GROUP BY 1 ORDER BY 1, 2;
                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Unique
   ->  Sort
         Sort Key: t1.c, (count(*))
         ->  Finalize GroupAggregate
               Group Key: t1.c
               ->  Sort
                     Sort Key: t1.c
                     ->  Gather
                           Workers Planned: 2
                           ->  Partial HashAggregate
                                 Group Key: t1.c
                                 ->  Parallel Append
                                       ->  Merge Right Join
                                             Merge Cond: (t2.c = t1.c)
                                             ->  Merge Join
                                                   Merge Cond: (t3.c = t2.c)
                                                   ->  Index Only Scan using iplt2_p1_c on plt2_p1 t3
                                                   ->  Materialize
                                                         ->  Index Only Scan using iplt1_p1_c on plt1_p1 t2
                                             ->  Materialize
                                                   ->  Index Scan using iplt1_p1_c on plt1_p1 t1
                                                         Filter: ((a % 25) = 0)
                                       ->  Merge Left Join
                                             Merge Cond: (t1_2.c = t2_2.c)
                                             ->  Parallel Index Scan using iplt1_p3_c on plt1_p3 t1_2
                                                   Filter: ((a % 25) = 0)
                                             ->  Materialize
                                                   ->  Merge Join
                                                         Merge Cond: (t3_2.c = t2_2.c)
                                                         ->  Index Only Scan using iplt2_p3_c on plt2_p3 t3_2
                                                         ->  Materialize
                                                               ->  Index Only Scan using iplt1_p3_c on plt1_p3 t2_2
                                       ->  Merge Left Join
                                             Merge Cond: (t1_1.c = t2_1.c)
                                             ->  Parallel Index Scan using iplt1_p2_c on plt1_p2 t1_1
                                                   Filter: ((a % 25) = 0)
                                             ->  Materialize
                                                   ->  Merge Join
                                                         Merge Cond: (t2_1.c = t3_1.c)
                                                         ->  Index Only Scan using iplt1_p2_c on plt1_p2 t2_1
                                                         ->  Index Only Scan using iplt2_p2_c on plt2_p2 t3_1
(41 rows)

SELECT DISTINCT t1.c, count(*)
FROM plt1 t1
  LEFT JOIN LATERAL
    (SELECT t2.c AS t2c, t3.c AS t3c, least(t1.c, t2.c, t3.c)
     FROM plt1 t2 JOIN plt2 t3 ON (t2.c = t3.c)) ss
  ON t1.c = ss.t2c
WHERE t1.a % 25 = 0
GROUP BY 1 ORDER BY 1, 2;
.
.
.
(no output -- the query just "hangs" and the terminal never comes back)

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

#2Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Rajkumar Raghuwanshi (#1)
Re: Query running for very long time (server hanged) with parallel append

Thanks Rajkumar for catching this. I will have a look ...

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

#3Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Rajkumar Raghuwanshi (#1)
Re: Query running for very long time (server hanged) with parallel append

On Thu, Feb 1, 2018 at 11:29 PM, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:

I am getting a server-hang kind of issue with the settings below.
The issue may occur while running the query below a single time or
multiple times (it is random), and I do not get the terminal back even
after cancelling the query. The query and its EXPLAIN output are given below.

Whatever logic bug might be causing the query to hang, it's not good
that we're unable to SIGINT/SIGTERM our way out of this state. See
also this other bug report for a known problem (already fixed but not
yet released), but which came with an extra complaint, as yet
unexplained, that the query couldn't be interrupted:

/messages/by-id/151724453314.1238.409882538067070269@wrigleys.postgresql.org

--
Thomas Munro
http://www.enterprisedb.com

#4Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Thomas Munro (#3)
Re: Query running for very long time (server hanged) with parallel append

On 2 February 2018 at 03:50, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

Whatever logic bug might be causing the query to hang, it's not good
that we're unable to SIGINT/SIGTERM our way out of this state. See
also this other bug report for a known problem (already fixed but not
yet released), but which came with an extra complaint, as yet
unexplained, that the query couldn't be interrupted:

/messages/by-id/151724453314.1238.409882538067070269@wrigleys.postgresql.org

Yeah, it is not good that there is no response to the SIGINT.

The query is actually hanging because one of the workers is stuck in a
small loop where it iterates over the subplans searching for unfinished
plans, and it never comes out of the loop (it's a bug which I have yet
to fix). And it does not make sense to keep a CHECK_FOR_INTERRUPTS in
each iteration; it's a small loop that does not pass control to any
other functions.

But I am not sure about this: while the workers are stuck there, why
does the backend that is waiting for the workers not come out of the
wait state on a SIGINT? I guess the same issue has been discussed in
the mail thread that you pointed to.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

#5Robert Haas
robertmhaas@gmail.com
In reply to: Amit Khandekar (#4)
Re: Query running for very long time (server hanged) with parallel append

On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

On 2 February 2018 at 03:50, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

Whatever logic bug might be causing the query to hang, it's not good
that we're unable to SIGINT/SIGTERM our way out of this state. See
also this other bug report for a known problem (already fixed but not
yet released), but which came with an extra complaint, as yet
unexplained, that the query couldn't be interrupted:

/messages/by-id/151724453314.1238.409882538067070269@wrigleys.postgresql.org

Yeah, it is not good that there is no response to the SIGINT.

The query is actually hanging because one of the workers is stuck in a
small loop where it iterates over the subplans searching for unfinished
plans, and it never comes out of the loop (it's a bug which I have yet
to fix). And it does not make sense to keep a CHECK_FOR_INTERRUPTS in
each iteration; it's a small loop that does not pass control to any
other functions.

Uh, sounds like we'd better fix that bug.

But I am not sure about this: while the workers are stuck there, why
does the backend that is waiting for the workers not come out of the
wait state on a SIGINT? I guess the same issue has been discussed in
the mail thread that you pointed to.

Is it getting stuck here?

/*
* We can't finish transaction commit or abort until all of the workers
* have exited. This means, in particular, that we can't respond to
* interrupts at this stage.
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
RESUME_INTERRUPTS();

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6David Kohn
djk447@gmail.com
In reply to: Robert Haas (#5)
Re: Query running for very long time (server hanged) with parallel append

Please forgive my inexperience with the codebase, but as the guy who
reported this bugger:
/messages/by-id/151724453314.1238.409882538067070269@wrigleys.postgresql.org
I thought I'd follow your hints, as it's causing some major issues for me.
So some notes on what is happening for me and some (possibly silly)
thoughts on why:

On Fri, Feb 2, 2018 at 10:16 AM Robert Haas <robertmhaas@gmail.com> wrote:

Is it getting stuck here?

/*
* We can't finish transaction commit or abort until all of the workers
* have exited. This means, in particular, that we can't respond to
* interrupts at this stage.
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
RESUME_INTERRUPTS();

I am seeing unkillable queries with the client backend in the
IPC-BgWorkerShutdown wait event. As far as I can tell, that can only
happen inside bgworker.c, in WaitForBackgroundWorkerShutdown, which is
called by parallel.c's WaitForParallelWorkersToExit inside
DestroyParallelContext. That, in turn, should be called when there is
a statement timeout (which I think is happening in at least some of my
cases), so it would make sense that this is where the problem is.

My background workers are in the IPC-MessageQueuePutMessage wait
event, which appears to be possible only from pqmq.c, in mq_putmessage.
Directly following the WaitLatch there is a CHECK_FOR_INTERRUPTS(); so,
if a worker is waiting on that latch and never gets to the interrupt
check, that would explain things. It also appears that the worker sends
a signal to the leader process a few lines before starting to wait,
which is supposed to tell the leader to come read messages off the
queue. If the leader reaches WaitForParallelWorkersToExit at the wrong
time and ends up waiting there, I could see how the two would end up
waiting for each other and never finishing.

The thing is that DestroyParallelContext does seem to detach from the
queues, but if the worker hit the wait step before the leader detached
from the queue, does it have any way of knowing that?
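
If that reading is right, the shape of the problem is an ordinary
two-party deadlock. The sketch below is a schematic, hypothetical
harness (none of these names are PostgreSQL code) showing the same
structure with a condition variable: the "leader" waits for the
"worker" to exit, while the "worker" waits for the leader to drain its
queue, so both block forever. Running it hangs by design:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool worker_exited = false;	/* what the leader waits for */
static bool queue_drained = false;	/* what the worker waits for */

static void *worker_main(void *arg)
{
	pthread_mutex_lock(&lock);
	/* mq_putmessage analogue: wait for the leader to read our message */
	while (!queue_drained)
		pthread_cond_wait(&cond, &lock);	/* blocks forever */
	worker_exited = true;					/* never reached */
	pthread_cond_broadcast(&cond);
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t	worker;

	pthread_create(&worker, NULL, worker_main, NULL);

	pthread_mutex_lock(&lock);
	/* WaitForParallelWorkersToExit analogue: waits before draining the queue */
	while (!worker_exited)
		pthread_cond_wait(&cond, &lock);	/* blocks forever */
	pthread_mutex_unlock(&lock);

	printf("never printed\n");
	return 0;
}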

Anyway, I'm entirely unsure of my analysis here, but thought I'd offer
something to help speed this along.

Best,
David Kohn

#7Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Robert Haas (#5)
1 attachment(s)
Re: Query running for very long time (server hanged) with parallel append

On 2 February 2018 at 20:46, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

The query is actually hanging because one of the workers is stuck in a
small loop where it iterates over the subplans searching for unfinished
plans, and it never comes out of the loop (it's a bug which I have yet
to fix). And it does not make sense to keep a CHECK_FOR_INTERRUPTS in
each iteration; it's a small loop that does not pass control to any
other functions.

Uh, sounds like we'd better fix that bug.

The scenario is this: one of the workers, w1, hasn't yet chosen its
first plan, and all the plans are already finished. So w1 has its
node->as_whichplan equal to -1, and the below condition in
choose_next_subplan_for_worker() never becomes true for this worker:

if (pstate->pa_next_plan == node->as_whichplan)
{
/* We've tried everything! */
pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
LWLockRelease(&pstate->pa_lock);
return false;
}
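
To make the non-termination concrete, here is a minimal standalone
sketch (a hypothetical harness, not the actual executor code) that
mirrors the loop with w1's state -- every subplan finished and
as_whichplan still -1. The exit test can never fire, because
pa_next_plan only ever cycles over the partial plans:

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
	bool	pa_finished[3] = {true, true, true};	/* every subplan is done */
	int		as_nplans = 3;
	int		first_partial_plan = 1;	/* plan 0 is the only non-partial plan */
	int		as_whichplan = -1;		/* w1 never chose a plan */
	int		pa_next_plan = 1;
	int		steps = 0;

	while (pa_finished[pa_next_plan])
	{
		if (pa_next_plan < as_nplans - 1)
			pa_next_plan++;						/* advance to next plan */
		else if (first_partial_plan < as_nplans)
			pa_next_plan = first_partial_plan;	/* loop back to first partial plan */
		else
			pa_next_plan = as_whichplan;		/* bail-out path, never reached here */

		/* Never true: pa_next_plan cycles over {1, 2} while as_whichplan is -1. */
		if (pa_next_plan == as_whichplan)
			break;

		if (++steps > 10)	/* cap the iterations; the real loop spins forever */
		{
			printf("still looping after %d steps\n", steps);
			return 1;
		}
	}
	return 0;
}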

What I think is: we should save, before the loop starts, the plan with
which we are starting the search. So, before the loop, we save the
starting plan value like this:
initial_plan = pstate->pa_next_plan;

And then break out of the loop when we come back to this initial plan:
if (initial_plan == pstate->pa_next_plan)
    break;

Now, suppose it so happens that initial_plan is a non-partial plan.
Then, once we wrap around to the first partial plan, the check
(initial_plan == pstate->pa_next_plan) can never become true, because
initial_plan is less than first_partial_plan.

So what I have done in the patch is: have a flag 'should_wrap_around'
indicating whether we should wrap around. This flag is false when
initial_plan is a non-partial plan, in which case we know that we will
have covered all the plans by the time we reach the last plan. So we
break from the loop if this flag is false (once we are at the last
plan), or once we have come back to the initial plan.

Attached is a patch that fixes this issue on the above lines.

But I am not sure about this: while the workers are stuck there, why
does the backend that is waiting for the workers not come out of the
wait state on a SIGINT? I guess the same issue has been discussed in
the mail thread that you pointed to.

Is it getting stuck here?

/*
* We can't finish transaction commit or abort until all of the workers
* have exited. This means, in particular, that we can't respond to
* interrupts at this stage.
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
RESUME_INTERRUPTS();

Actually, the backend is getting stuck in
choose_next_subplan_for_leader(), in LWLockAcquire(), waiting for the
hanging worker to release pstate->pa_lock. I think there is nothing
wrong with this, because an LWLock wait is assumed to be very short;
it is only because of this bug that the LWLock wait lasts forever.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

fix_hang_issue.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 64a17fb..2dbb658 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -456,6 +456,8 @@ choose_next_subplan_for_worker(AppendState *node)
 {
 	ParallelAppendState *pstate = node->as_pstate;
 	Append	   *append = (Append *) node->ps.plan;
+	int			initial_plan;
+	bool		should_wrap_around;
 
 	/* Backward scan is not supported by parallel-aware plans */
 	Assert(ScanDirectionIsForward(node->ps.state->es_direction));
@@ -473,6 +475,14 @@ choose_next_subplan_for_worker(AppendState *node)
 		return false;
 	}
 
+	/*
+	 * If we are starting the search from a non-partial plan, no point in
+	 * looping back to the first partial plan: we would have covered all the
+	 * plans by the time we reach the last plan.
+	 */
+	initial_plan = pstate->pa_next_plan;
+	should_wrap_around = (initial_plan >= append->first_partial_plan);
+
 	/* Loop until we find a subplan to execute. */
 	while (pstate->pa_finished[pstate->pa_next_plan])
 	{
@@ -481,18 +491,18 @@ choose_next_subplan_for_worker(AppendState *node)
 			/* Advance to next plan. */
 			pstate->pa_next_plan++;
 		}
-		else if (append->first_partial_plan < node->as_nplans)
+		else if (!should_wrap_around)
 		{
-			/* Loop back to first partial plan. */
-			pstate->pa_next_plan = append->first_partial_plan;
+			/* Covered all plans, arrange to bail out. */
+			pstate->pa_next_plan = initial_plan;
 		}
 		else
 		{
-			/* At last plan, no partial plans, arrange to bail out. */
-			pstate->pa_next_plan = node->as_whichplan;
+			/* Loop back to first partial plan. */
+			pstate->pa_next_plan = append->first_partial_plan;
 		}
 
-		if (pstate->pa_next_plan == node->as_whichplan)
+		if (pstate->pa_next_plan == initial_plan)
 		{
 			/* We've tried everything! */
 			pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
#8Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Amit Khandekar (#7)
Re: Query running for very long time (server hanged) with parallel append

At Mon, 5 Feb 2018 15:29:27 +0530, Amit Khandekar <amitdkhan.pg@gmail.com> wrote in <CAJ3gD9eFR8=kxjPYBEHe34uT9+RYET0VbhGEfSt79eZx3L9E1Q@mail.gmail.com>

On 2 February 2018 at 20:46, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

The query is actually hanging because one of the workers is stuck in a
small loop where it iterates over the subplans searching for unfinished
plans, and it never comes out of the loop (it's a bug which I have yet
to fix). And it does not make sense to keep a CHECK_FOR_INTERRUPTS in
each iteration; it's a small loop that does not pass control to any
other functions.

Uh, sounds like we'd better fix that bug.

The scenario is this: one of the workers, w1, hasn't yet chosen its
first plan, and all the plans are already finished. So w1 has its
node->as_whichplan equal to -1, and the below condition in
choose_next_subplan_for_worker() never becomes true for this worker:

if (pstate->pa_next_plan == node->as_whichplan)
{
/* We've tried everything! */
pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
LWLockRelease(&pstate->pa_lock);
return false;
}

What I think is: we should save, before the loop starts, the plan with
which we are starting the search. So, before the loop, we save the
starting plan value like this:
initial_plan = pstate->pa_next_plan;

And then break out of the loop when we come back to this initial plan:
if (initial_plan == pstate->pa_next_plan)
    break;

Now, suppose it so happens that initial_plan is a non-partial plan.
Then, once we wrap around to the first partial plan, the check
(initial_plan == pstate->pa_next_plan) can never become true, because
initial_plan is less than first_partial_plan.

So what I have done in the patch is: have a flag 'should_wrap_around'
indicating whether we should wrap around. This flag is false when
initial_plan is a non-partial plan, in which case we know that we will
have covered all the plans by the time we reach the last plan. So we
break from the loop if this flag is false (once we are at the last
plan), or once we have come back to the initial plan.

Attached is a patch that fixes this issue on the above lines.

The patch adds two new variables and always sets them, but wrap-around
can happen at most once per call, so I think it is enough to arrange
to stop at wrap-around time. In addition to that, the patch forgets
the case of no partial plans; the loop can overrun pa_finished in that
case.

I think something like the following would work.

@@ -485,6 +485,13 @@ choose_next_subplan_for_worker(AppendState *node)
     {
       /* Loop back to first partial plan. */
       pstate->pa_next_plan = append->first_partial_plan;
+
+      /*
+       * Arrange to bail out if we are trying to fetch the first partial
+       * plan
+       */
+      if (node->as_whichplan < append->first_partial_plan)
+        node->as_whichplan = append->first_partial_plan;
     }
     else

But I am not sure about this: while the workers are stuck there, why
does the backend that is waiting for the workers not come out of the
wait state on a SIGINT? I guess the same issue has been discussed in
the mail thread that you pointed to.

Is it getting stuck here?

/*
* We can't finish transaction commit or abort until all of the workers
* have exited. This means, in particular, that we can't respond to
* interrupts at this stage.
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
RESUME_INTERRUPTS();

Actually, the backend is getting stuck in
choose_next_subplan_for_leader(), in LWLockAcquire(), waiting for the
hanging worker to release pstate->pa_lock. I think there is nothing
wrong with this, because an LWLock wait is assumed to be very short;
it is only because of this bug that the LWLock wait lasts forever.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#9Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Kyotaro HORIGUCHI (#8)
Re: Query running for very long time (server hanged) with parallel append

At Tue, 06 Feb 2018 13:34:19 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180206.133419.02213593.horiguchi.kyotaro@lab.ntt.co.jp>

At Mon, 5 Feb 2018 15:29:27 +0530, Amit Khandekar <amitdkhan.pg@gmail.com> wrote in <CAJ3gD9eFR8=kxjPYBEHe34uT9+RYET0VbhGEfSt79eZx3L9E1Q@mail.gmail.com>

On 2 February 2018 at 20:46, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

The query is actually hanging because one of the workers is stuck in a
small loop where it iterates over the subplans searching for unfinished
plans, and it never comes out of the loop (it's a bug which I have yet
to fix). And it does not make sense to keep a CHECK_FOR_INTERRUPTS in
each iteration; it's a small loop that does not pass control to any
other functions.

Uh, sounds like we'd better fix that bug.

The scenario is this: one of the workers, w1, hasn't yet chosen its
first plan, and all the plans are already finished. So w1 has its
node->as_whichplan equal to -1, and the below condition in
choose_next_subplan_for_worker() never becomes true for this worker:

if (pstate->pa_next_plan == node->as_whichplan)
{
/* We've tried everything! */
pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
LWLockRelease(&pstate->pa_lock);
return false;
}

What I think is: we should save, before the loop starts, the plan with
which we are starting the search. So, before the loop, we save the
starting plan value like this:
initial_plan = pstate->pa_next_plan;

And then break out of the loop when we come back to this initial plan:
if (initial_plan == pstate->pa_next_plan)
    break;

Now, suppose it so happens that initial_plan is a non-partial plan.
Then, once we wrap around to the first partial plan, the check
(initial_plan == pstate->pa_next_plan) can never become true, because
initial_plan is less than first_partial_plan.

So what I have done in the patch is: have a flag 'should_wrap_around'
indicating whether we should wrap around. This flag is false when
initial_plan is a non-partial plan, in which case we know that we will
have covered all the plans by the time we reach the last plan. So we
break from the loop if this flag is false (once we are at the last
plan), or once we have come back to the initial plan.

Attached is a patch that fixes this issue on the above lines.

The patch adds two new variables and always sets them, but wrap-around
can happen at most once per call, so I think it is enough to arrange
to stop at wrap-around time. In addition to that, the patch forgets
the case of no partial plans; the loop can overrun pa_finished in that
case.

Sorry, the patch works fine. But still the new variables don't
seem needed.

I think something like the following would work.

@@ -485,6 +485,13 @@ choose_next_subplan_for_worker(AppendState *node)
{
/* Loop back to first partial plan. */
pstate->pa_next_plan = append->first_partial_plan;
+
+      /*
+       * Arrange to bail out if we are trying to fetch the first partial
+       * plan
+       */
+      if (node->as_whichplan < append->first_partial_plan)
+        node->as_whichplan = append->first_partial_plan;
}
else

But I am not sure about this: while the workers are stuck there, why
does the backend that is waiting for the workers not come out of the
wait state on a SIGINT? I guess the same issue has been discussed in
the mail thread that you pointed to.

Is it getting stuck here?

/*
* We can't finish transaction commit or abort until all of the workers
* have exited. This means, in particular, that we can't respond to
* interrupts at this stage.
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
RESUME_INTERRUPTS();

Actually, the backend is getting stuck in
choose_next_subplan_for_leader(), in LWLockAcquire(), waiting for the
hanging worker to release pstate->pa_lock. I think there is nothing
wrong with this, because an LWLock wait is assumed to be very short;
it is only because of this bug that the LWLock wait lasts forever.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#10Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Kyotaro HORIGUCHI (#9)
Re: Query running for very long time (server hanged) with parallel append

On 6 February 2018 at 10:11, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Mon, 5 Feb 2018 15:29:27 +0530, Amit Khandekar <amitdkhan.pg@gmail.com> wrote in

Attached is a patch that fixes this issue on the above lines.

The patch adds two new variables and always sets them, but wrap-around
can happen at most once per call, so I think it is enough to arrange
to stop at wrap-around time. In addition to that, the patch forgets
the case of no partial plans; the loop can overrun pa_finished in that
case.

Sorry, the patch works fine. But still the new variables don't
seem needed.

True, I could have initially set as_whichplan to pa_next_plan and used
as_whichplan instead of initial_plan. And I could have used the
condition (initial_plan >= append->first_partial_plan) instead of
should_wrap_around. The reason I used these two variables is that I
thought the logic would be more reader-friendly. (Also,
should_wrap_around is computed from values that do not change inside
the loop, so using this variable avoids re-evaluating the condition
(initial_plan >= append->first_partial_plan) on each iteration.)

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

#11Rajkumar Raghuwanshi
rajkumar.raghuwanshi@enterprisedb.com
In reply to: Amit Khandekar (#7)
Re: Query running for very long time (server hanged) with parallel append

On Mon, Feb 5, 2018 at 3:29 PM, Amit Khandekar <amitdkhan.pg@gmail.com>
wrote:

Attached is a patch that fixes this issue on the above lines.

The patch applied cleanly and works fine for me. The mentioned issue
is not reproducible now.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

#12Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Amit Khandekar (#10)
Re: Query running for very long time (server hanged) with parallel append

At Tue, 6 Feb 2018 11:56:27 +0530, Amit Khandekar <amitdkhan.pg@gmail.com> wrote in <CAJ3gD9fJhKBHn-fx6UzYZeLa0N8WnnZb2BsUTm_5A9kYsJgnww@mail.gmail.com>

On 6 February 2018 at 10:11, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Mon, 5 Feb 2018 15:29:27 +0530, Amit Khandekar <amitdkhan.pg@gmail.com> wrote in

Attached is a patch that fixes this issue on the above lines.

The patch adds two new variables and always sets them, but wrap-around
can happen at most once per call, so I think it is enough to arrange
to stop at wrap-around time. In addition to that, the patch forgets
the case of no partial plans; the loop can overrun pa_finished in that
case.

Sorry, the patch works fine. But still the new variables don't
seem needed.

True, I could have initially set as_whichplan to pa_next_plan and used
as_whichplan instead of initial_plan. And I could have used the
condition (initial_plan >= append->first_partial_plan) instead of
should_wrap_around. The reason I used these two variables is that I
thought the logic would be more reader-friendly. (Also,
should_wrap_around is computed from values that do not change inside
the loop, so using this variable avoids re-evaluating the condition
(initial_plan >= append->first_partial_plan) on each iteration.)

Thanks for the explanation. I'm fine with it. May I make some comments
on the patch?

- It seems to me that the if (!should_wrap_around) block and the else
block should be transposed, so that the bail-out code stays close to
the exit-condition check, as in the previous code. (In other words,
the second block should be if (should_wrap_around), not
if (!should_wrap_around).)

- The following comment needs a "the" before "first", maybe:
 +			/* Loop back to first partial plan. */

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#13Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Kyotaro HORIGUCHI (#12)
1 attachment(s)
Re: Query running for very long time (server hanged) with parallel append

On 6 February 2018 at 16:11, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Thanks for the explanation. I'm fine with it. May I make some comments
on the patch?

Sure, always welcome.

- It seems to me that the if (!should_wrap_around) block and the else
block should be transposed, so that the bail-out code stays close to
the exit-condition check, as in the previous code. (In other words,
the second block should be if (should_wrap_around), not
if (!should_wrap_around).)

Yeah, I think it looks equally good that way, and like you said, the
current code does it that way. So in the attached patch, I have
swapped the two conditions.

- The following comment needs a "the" before "first", maybe:
+                      /* Loop back to first partial plan. */

Since we are not changing this line in the patch, I have left that
alone. But the committer can very well make this change, although I am
not sure myself about the grammar.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

fix_hang_issue_v2.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 64a17fb..ca9f121 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -456,6 +456,8 @@ choose_next_subplan_for_worker(AppendState *node)
 {
 	ParallelAppendState *pstate = node->as_pstate;
 	Append	   *append = (Append *) node->ps.plan;
+	int			initial_plan;
+	bool		should_wrap_around;
 
 	/* Backward scan is not supported by parallel-aware plans */
 	Assert(ScanDirectionIsForward(node->ps.state->es_direction));
@@ -473,6 +475,14 @@ choose_next_subplan_for_worker(AppendState *node)
 		return false;
 	}
 
+	/*
+	 * If we are starting the search from a non-partial plan, no point in
+	 * looping back to the first partial plan: we would have covered all the
+	 * plans by the time we reach the last plan.
+	 */
+	initial_plan = pstate->pa_next_plan;
+	should_wrap_around = (initial_plan >= append->first_partial_plan);
+
 	/* Loop until we find a subplan to execute. */
 	while (pstate->pa_finished[pstate->pa_next_plan])
 	{
@@ -481,18 +491,18 @@ choose_next_subplan_for_worker(AppendState *node)
 			/* Advance to next plan. */
 			pstate->pa_next_plan++;
 		}
-		else if (append->first_partial_plan < node->as_nplans)
+		else if (should_wrap_around)
 		{
 			/* Loop back to first partial plan. */
 			pstate->pa_next_plan = append->first_partial_plan;
 		}
 		else
 		{
-			/* At last plan, no partial plans, arrange to bail out. */
-			pstate->pa_next_plan = node->as_whichplan;
+			/* Covered all plans, arrange to bail out. */
+			pstate->pa_next_plan = initial_plan;
 		}
 
-		if (pstate->pa_next_plan == node->as_whichplan)
+		if (pstate->pa_next_plan == initial_plan)
 		{
 			/* We've tried everything! */
 			pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
#14Robert Haas
robertmhaas@gmail.com
In reply to: Amit Khandekar (#13)
1 attachment(s)
Re: Query running for very long time (server hanged) with parallel append

On Tue, Feb 6, 2018 at 11:32 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

Yeah, I think it looks equally good that way, and like you said, the
current code does it that way. So in the attached patch, I have
swapped the two conditions.

I prefer to avoid introducing 2 new variables and instead just prevent
the looping directly in the case where we started with a non-partial
plan.

See attached. Does this look OK?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

fix-hang-issue-rmh.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 64a17fb032..640269564e 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -481,14 +481,18 @@ choose_next_subplan_for_worker(AppendState *node)
 			/* Advance to next plan. */
 			pstate->pa_next_plan++;
 		}
-		else if (append->first_partial_plan < node->as_nplans)
+		else if (append->first_partial_plan < node->as_nplans &&
+				 node->as_whichplan > append->first_partial_plan)
 		{
 			/* Loop back to first partial plan. */
 			pstate->pa_next_plan = append->first_partial_plan;
 		}
 		else
 		{
-			/* At last plan, no partial plans, arrange to bail out. */
+			/*
+			 * At last plan, and either there are no partial plans or we've
+			 * tried them all.  Arrange to bail out.
+			 */
 			pstate->pa_next_plan = node->as_whichplan;
 		}
 
#15Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Robert Haas (#14)
Re: Query running for very long time (server hanged) with parallel append

At Tue, 6 Feb 2018 13:50:28 -0500, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoYqdC+9U8mLYkUgM=CaBt6Pzz4R_YNboqDbW-LvUaHO+g@mail.gmail.com>

On Tue, Feb 6, 2018 at 11:32 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

Yeah, I think it looks equally good that way, and like you said, the
current code does it that way. So in the attached patch, I have
swapped the two conditions.

I prefer to avoid introducing 2 new variables and instead just prevent
the looping directly in the case where we started with a non-partial
plan.

See attached. Does this look OK?

Ah, we can bail out when starting from the first partial plan.
It makes me a bit uneasy that pa_next_plan can become -1, but it looks
perfect to me.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#16Amit Khandekar
amitdkhan.pg@gmail.com
In reply to: Kyotaro HORIGUCHI (#15)
1 attachment(s)
Re: Query running for very long time (server hanged) with parallel append

On 7 February 2018 at 07:30, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Tue, 6 Feb 2018 13:50:28 -0500, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoYqdC+9U8mLYkUgM=CaBt6Pzz4R_YNboqDbW-LvUaHO+g@mail.gmail.com>

On Tue, Feb 6, 2018 at 11:32 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

Yeah, I think it looks equally good that way, and like you said, the
current code does it that way. So in the attached patch, I have
swapped the two conditions.

I prefer to avoid introducing 2 new variables and instead just prevent
the looping directly in the case where we started with a non-partial
plan.

See attached. Does this look OK?

Ah, we can bail out when starting from the first partial plan.

It makes me a bit uneasy that pa_next_plan can become -1

I also feel that way. In the new condition (node->as_whichplan >
append->first_partial_plan), there is an implicit assumption that
INVALID_SUBPLAN_INDEX is equal to -1. So in our issue, if
node->as_whichplan is -1, this condition becomes false and the loop
breaks correctly. But like I said, it works correctly only because
INVALID_SUBPLAN_INDEX is defined to be less than zero. I guess this
assumption might be safe in the long term as well.

But another thing is that node->as_whichplan can point to some
non-partial plan, because it holds the value of the previously chosen
plan. In that case (node->as_whichplan > append->first_partial_plan)
is false, indicating that we should not wrap around. But there might
still be unfinished partial plans, and pa_next_plan might be pointing
to, say, a plan at the tail end. Here it still makes sense to wrap
around and see whether there are unfinished partial plans, but the
above condition would not allow us to wrap around.
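
Here is a minimal standalone trace of that scenario (a hypothetical
harness, not the executor code), using the condition from
fix-hang-issue-rmh.patch. Plan 0 is the non-partial plan this worker
ran earlier, partial plan 1 is still unfinished, and pa_next_plan
starts at the tail; the search bails out without ever wrapping around
to plan 1:

#include <stdbool.h>
#include <stdio.h>

#define INVALID_SUBPLAN_INDEX (-1)

int main(void)
{
	bool	pa_finished[3] = {true, false, true};	/* partial plan 1 unfinished */
	int		as_nplans = 3;
	int		first_partial_plan = 1;
	int		as_whichplan = 0;	/* earlier chosen plan: the non-partial one */
	int		pa_next_plan = 2;	/* shared counter happens to be at the tail */

	while (pa_finished[pa_next_plan])
	{
		if (pa_next_plan < as_nplans - 1)
			pa_next_plan++;
		else if (first_partial_plan < as_nplans &&
				 as_whichplan > first_partial_plan)	/* false: 0 > 1 */
			pa_next_plan = first_partial_plan;		/* wrap-around, never taken */
		else
			pa_next_plan = as_whichplan;			/* arrange to bail out */

		if (pa_next_plan == as_whichplan)
		{
			pa_next_plan = INVALID_SUBPLAN_INDEX;
			printf("bailed out, although plan 1 was still unfinished\n");
			return 0;
		}
	}
	printf("picked plan %d\n", pa_next_plan);
	return 0;
}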

For this, we should check whether we have started the search with a
partial plan. That is, use a condition like the one I used:
(initial_plan >= append->first_partial_plan). Or else, if we don't
want to add a new variable, set node->as_whichplan to pa_next_plan
initially.

Also, the first condition here is not necessary:
-       else if (append->first_partial_plan < node->as_nplans)
+       else if (append->first_partial_plan < node->as_nplans &&
+                node->as_whichplan > append->first_partial_plan)
The fact that we are starting from a non-partial plan is by itself
enough to determine that we should not wrap around, regardless of
whether there are any partial plans. So we can keep just this single
condition:
-       else if (append->first_partial_plan < node->as_nplans)
+       else if (node->as_whichplan > append->first_partial_plan)

Attached is the patch accordingly.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Attachments:

fix_hang_issue_v3.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 64a17fb..264d8fe 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -473,6 +473,9 @@ choose_next_subplan_for_worker(AppendState *node)
 		return false;
 	}
 
+	/* Save the plan from which we are starting the search. */
+	node->as_whichplan = pstate->pa_next_plan;
+
 	/* Loop until we find a subplan to execute. */
 	while (pstate->pa_finished[pstate->pa_next_plan])
 	{
@@ -481,14 +484,17 @@ choose_next_subplan_for_worker(AppendState *node)
 			/* Advance to next plan. */
 			pstate->pa_next_plan++;
 		}
-		else if (append->first_partial_plan < node->as_nplans)
+		else if (node->as_whichplan > append->first_partial_plan)
 		{
 			/* Loop back to first partial plan. */
 			pstate->pa_next_plan = append->first_partial_plan;
 		}
 		else
 		{
-			/* At last plan, no partial plans, arrange to bail out. */
+			/*
+			 * At last plan, and either there are no partial plans or we've
+			 * tried them all.  Arrange to bail out.
+			 */
 			pstate->pa_next_plan = node->as_whichplan;
 		}
 
#17Robert Haas
robertmhaas@gmail.com
In reply to: Amit Khandekar (#16)
Re: Query running for very long time (server hanged) with parallel append

On Wed, Feb 7, 2018 at 2:11 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:

Attached is the patch accordingly.

OK, I see. That makes sense; committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company