PATCH: pg_restore parallel-execution-deadlock issue

Started by Armin Schöffmannalmost 10 years ago7 messages
#1Armin Schöffmann
armin.schoeffmann@aegaeon.de
2 attachment(s)

worthy hackers,
I propose the below patches to parallels.c and pg_backup_utils.c fixing deadlocks in pg_restore (windows only) if running more than 2 parallel jobs.
This problem was reported by me earlier this year.
/messages/by-id/20160307161619.25731.78653@wrigleys.postgresql.org

- Winsock's "recv(...)" called in piperead() is a blocking read by default, therefor, signalizing termEvent as used in ShutdownWorkersHard() is not enough to make worker-threads go away.
We need a preceding shutdown(pipeWrite, SD_BOTH), first, to abort blocking IO in this case.
Otherwise, the main-thread will wait forever, if more than one additional worker is active (e.g. option -j3) and a premature EOF occurs in the input-file.

Findings in pg_backup_utils.c/ parallels.c, which could impact other tools, too:

- threads created with _beginthreadex need to be exited by either a "return exitcode" or "_endthreadex(exitcode)". It might be obsolete in fire-and-forget-scenarios, but it matters in other cases.
As of current, pg_backup_utils uses EndThread to retire additional worker-threads., which are spawned by _beginthreadex in parallel.c. The corresponding call for ExitThread would be CreateThread,
nevertheless, _beginthreadex is the correct choice here, as we do call-out into CRT and need to retain the thread-handle for after-death synchronization with the main-thread.
The thread-handle needs to be closed explicitly.

If this is not the correct place to discuss patches, I'd be glad if somebody can notify the tool's maintainer, to take a look into it.

best regards,
Armin Schöffmann.

--
Aegaeon technologies GmbH
phone: +49.941.8107344
fax: +49.941.8107356

Legal disclaimer:
http://aegaeon.de/disclaimer/email_all_int.txt

parallel.c

@@ -350,7 +350,8 @@ static void
 ShutdownWorkersHard(ParallelState *pstate)
 {
-#ifndef WIN32
+
 	int			i;

+#ifndef WIN32
signal(SIGPIPE, SIG_IGN);

@@ -367,4 +368,7 @@ ShutdownWorkersHard(ParallelState *pstate)
 	/* The workers monitor this event via checkAborting(). */
 	SetEvent(termEvent);
+
+	for (i = 0; i < pstate->numWorkers; i++)
+		shutdown(pstate->parallelSlot[i].pipeWrite, SD_BOTH);
 #endif
@@ -408,5 +412,8 @@ WaitForTerminatingWorkers(ParallelState *pstate)
 		for (j = 0; j < pstate->numWorkers; j++)
 			if (pstate->parallelSlot[j].hThread == hThread)
+			{
 				slot = &(pstate->parallelSlot[j]);
+				CloseHandle(hThread);
+			}

free(lpHandles);

pg_backup_utils.c

@@ -120,5 +120,5 @@ exit_nicely(int code)
 #ifdef WIN32
 	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
-		ExitThread(code);
+		_endthreadex(code);
 #endif

Attachments:

parallel.c.diffapplication/octet-stream; name=parallel.c.diffDownload
{\rtf1\ansi\ansicpg1252\deff0\deflang1031{\fonttbl{\f0\fnil\fcharset0 Courier New;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\sl240\slmult1\f0\fs22 @@ -350,7 +350,8 @@ static void\par
 ShutdownWorkersHard(ParallelState *pstate)\par
 \{\par
-#ifndef WIN32\par
+\par
 \tab int\tab\tab\tab i;\par
 \par
+#ifndef WIN32\par
 \tab signal(SIGPIPE, SIG_IGN);\par
 \par
@@ -367,4 +368,7 @@ ShutdownWorkersHard(ParallelState *pstate)\par
 \tab /* The workers monitor this event via checkAborting(). */\par
 \tab SetEvent(termEvent);\par
+\par
+\tab for (i = 0; i < pstate->numWorkers; i++)\par
+\tab\tab shutdown(pstate->parallelSlot[i].pipeWrite, SD_BOTH);\par
 #endif\par
 \par
@@ -408,5 +412,8 @@ WaitForTerminatingWorkers(ParallelState *pstate)\par
 \tab\tab for (j = 0; j < pstate->numWorkers; j++)\par
 \tab\tab\tab if (pstate->parallelSlot[j].hThread == hThread)\par
+\tab\tab\tab\{\par
 \tab\tab\tab\tab slot = &(pstate->parallelSlot[j]);\par
+\tab\tab\tab\tab CloseHandle(hThread);\par
+\tab\tab\tab\}\par
 \par
 \tab\tab free(lpHandles);\par
}
pg_backup_utils.c.diffapplication/octet-stream; name=pg_backup_utils.c.diffDownload
{\rtf1\ansi\ansicpg1252\deff0\deflang1031{\fonttbl{\f0\fnil\fcharset0 Courier New;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\f0\fs22 @@ -120,5 +120,5 @@ exit_nicely(int code)\par
 #ifdef WIN32\par
 \tab if (parallel_init_done && GetCurrentThreadId() != mainThreadId)\par
-\tab\tab ExitThread(code);\par
+\tab\tab _endthreadex(code);\par
 #endif\par
}
#2Michael Paquier
michael.paquier@gmail.com
In reply to: Armin Schöffmann (#1)
1 attachment(s)
Re: PATCH: pg_restore parallel-execution-deadlock issue

On Tue, Apr 5, 2016 at 9:28 AM, Armin Schöffmann
<armin.schoeffmann@aegaeon.de> wrote:

I propose the below patches to parallels.c and pg_backup_utils.c fixing deadlocks in pg_restore (windows only) if running more than 2 parallel jobs.
This problem was reported by me earlier this year.
/messages/by-id/20160307161619.25731.78653@wrigleys.postgresql.org

Yes, I recall that... It is one of the things that I have bookmarked
on my box and that I wanted to look at at some point.. Well now's the
time.

- Winsock's "recv(...)" called in piperead() is a blocking read by default, therefor, signalizing termEvent as used in ShutdownWorkersHard() is not enough to make worker-threads go away.
We need a preceding shutdown(pipeWrite, SD_BOTH), first, to abort blocking IO in this case.
Otherwise, the main-thread will wait forever, if more than one additional worker is active (e.g. option -j3) and a premature EOF occurs in the input-file.

    /* The workers monitor this event via checkAborting(). */
    SetEvent(termEvent);
+
+   /* Disable send and receive on the given socket */
+   for (i = 0; i < pstate->numWorkers; i++)
+       shutdown(pstate->parallelSlot[i].pipeWrite, SD_BOTH);
 #endif
Looking at this code, it is indeed tricky. We cannot just close the
sockets because of the blocking call emulated in WIN32's piperead
added in parallel.c, and it is necessary to be in line with the
termination event. This really meritates a comment in the code. I
added one in the patch attached.

Findings in pg_backup_utils.c/ parallels.c, which could impact other tools, too:
- threads created with _beginthreadex need to be exited by either a "return exitcode" or "_endthreadex(exitcode)". It might be obsolete in fire-and-forget-scenarios, but it matters in other cases.
As of current, pg_backup_utils uses EndThread to retire additional worker-threads., which are spawned by _beginthreadex in parallel.c. The corresponding call for ExitThread would be CreateThread,
nevertheless, _beginthreadex is the correct choice here, as we do call-out into CRT and need to retain the thread-handle for after-death synchronization with the main-thread.
The thread-handle needs to be closed explicitly.

This is as well explained here:
https://msdn.microsoft.com/en-us/library/kdzttdcb.aspx
"endthread and _endthreadex reclaim allocated thread resources and
then call ExitThread."

#ifdef WIN32
if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
- ExitThread(code);
+ _endthreadex(code);
#endif
This is indeed the right thing to do per the docs if _beginthreadex
has been called to initialize it.

        for (j = 0; j < pstate->numWorkers; j++)
+       {
            if (pstate->parallelSlot[j].hThread == hThread)
+           {
                slot = &(pstate->parallelSlot[j]);
+               CloseHandle(hThread);
+           }
+       }
OK for closing the handle here. You are missing a cast to HANDLE here
actually or this code generates a warning.

If this is not the correct place to discuss patches, I'd be glad if somebody can notify the tool's maintainer, to take a look into it.

Here or -bugs are correct places to discuss such issues. People doing
from time to time work with Windows hang up on the two lists.
--
Michael

Attachments:

win32-dump-parallel.patchinvalid/octet-stream; name=win32-dump-parallel.patchDownload
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 9167294..7e73562 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -349,9 +349,9 @@ checkAborting(ArchiveHandle *AH)
 static void
 ShutdownWorkersHard(ParallelState *pstate)
 {
-#ifndef WIN32
 	int			i;
 
+#ifndef WIN32
 	signal(SIGPIPE, SIG_IGN);
 
 	/*
@@ -366,6 +366,14 @@ ShutdownWorkersHard(ParallelState *pstate)
 #else
 	/* The workers monitor this event via checkAborting(). */
 	SetEvent(termEvent);
+
+	/*
+	 * Disable send and receive on the given socket. Closing sockets is not
+	 * reliable here because piperead()'s WIN32 emulated here does a blocking
+	 * call of recv(), and this will stop the read done on them.
+	 */
+	for (i = 0; i < pstate->numWorkers; i++)
+		shutdown(pstate->parallelSlot[i].pipeWrite, SD_BOTH);
 #endif
 
 	WaitForTerminatingWorkers(pstate);
@@ -406,9 +414,19 @@ WaitForTerminatingWorkers(ParallelState *pstate)
 		hThread = lpHandles[ret - WAIT_OBJECT_0];
 
 		for (j = 0; j < pstate->numWorkers; j++)
+		{
 			if (pstate->parallelSlot[j].hThread == hThread)
+			{
 				slot = &(pstate->parallelSlot[j]);
 
+				/*
+				 * _endthreadex does not close the handle by itself, hence
+				 * do it here.
+				 */
+				CloseHandle((HANDLE) hThread);
+			}
+		}
+
 		free(lpHandles);
 #endif
 		Assert(slot);
diff --git a/src/bin/pg_dump/pg_backup_utils.c b/src/bin/pg_dump/pg_backup_utils.c
index 0aa13cd..0ef7e5e 100644
--- a/src/bin/pg_dump/pg_backup_utils.c
+++ b/src/bin/pg_dump/pg_backup_utils.c
@@ -119,7 +119,7 @@ exit_nicely(int code)
 
 #ifdef WIN32
 	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
-		ExitThread(code);
+		_endthreadex(code);
 #endif
 
 	exit(code);
#3Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#2)
Re: PATCH: pg_restore parallel-execution-deadlock issue

On Fri, Apr 8, 2016 at 2:24 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Tue, Apr 5, 2016 at 9:28 AM, Armin Schöffmann
<armin.schoeffmann@aegaeon.de> wrote:

If this is not the correct place to discuss patches, I'd be glad if somebody can notify the tool's maintainer, to take a look into it.

Here or -bugs are correct places to discuss such issues. People doing
from time to time work with Windows hang up on the two lists.

ea274b2 has changed the way disconnection is done is is now closing
both the read and write pipes. So you may want to retry if things get
better with the next round of minor releases.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#3)
Re: PATCH: pg_restore parallel-execution-deadlock issue

Michael Paquier <michael.paquier@gmail.com> writes:

ea274b2 has changed the way disconnection is done is is now closing
both the read and write pipes. So you may want to retry if things get
better with the next round of minor releases.

Hadn't paid attention to this thread before ...

It looks like there are still a few things we need to deal with before
considering Armin's submission resolved:

1. Armin proposes using "shutdown(pipeWrite, SD_BOTH)" where the code
committed this morning (df8d2d8c4) has "closesocket(pipeWrite)".
I'd prefer to leave it that way since it's the same as for the Unix case,
and Kyotaro-san says it works for him. Is there a reason we'd need
shutdown() instead?

2. Armin proposes that WaitForTerminatingWorkers needs to do CloseHandle()
on the various thread handles. That sounds plausible but I don't know
enough Windows programming to know if it really matters.

3. Should we replace ExitThread() with _endthreadex()? Again, it
seems plausible but I'm not the person to ask.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#4)
Re: PATCH: pg_restore parallel-execution-deadlock issue

On Fri, May 27, 2016 at 3:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

ea274b2 has changed the way disconnection is done is is now closing
both the read and write pipes. So you may want to retry if things get
better with the next round of minor releases.

Hadn't paid attention to this thread before ...

It looks like there are still a few things we need to deal with before
considering Armin's submission resolved:

1. Armin proposes using "shutdown(pipeWrite, SD_BOTH)" where the code
committed this morning (df8d2d8c4) has "closesocket(pipeWrite)".
I'd prefer to leave it that way since it's the same as for the Unix case,
and Kyotaro-san says it works for him. Is there a reason we'd need
shutdown() instead?

2. Armin proposes that WaitForTerminatingWorkers needs to do CloseHandle()
on the various thread handles. That sounds plausible but I don't know
enough Windows programming to know if it really matters.

3. Should we replace ExitThread() with _endthreadex()? Again, it
seems plausible but I'm not the person to ask.

I think point (2) and (3) are related because using _endthreadex won't
close the thread handle explicitly [1]https://msdn.microsoft.com/en-us/library/kdzttdcb.aspx Refer line "*_endthread* automatically closes the thread handle, whereas *_endthreadex* does not.".

[1]: https://msdn.microsoft.com/en-us/library/kdzttdcb.aspx Refer line "*_endthread* automatically closes the thread handle, whereas *_endthreadex* does not."
Refer line "*_endthread* automatically closes the thread handle, whereas
*_endthreadex* does not."

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#6Michael Paquier
michael.paquier@gmail.com
In reply to: Amit Kapila (#5)
Re: PATCH: pg_restore parallel-execution-deadlock issue

On Fri, May 27, 2016 at 4:07 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, May 27, 2016 at 3:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

ea274b2 has changed the way disconnection is done is is now closing
both the read and write pipes. So you may want to retry if things get
better with the next round of minor releases.

Hadn't paid attention to this thread before ...

1. Armin proposes using "shutdown(pipeWrite, SD_BOTH)" where the code
committed this morning (df8d2d8c4) has "closesocket(pipeWrite)".
I'd prefer to leave it that way since it's the same as for the Unix case,
and Kyotaro-san says it works for him. Is there a reason we'd need
shutdown() instead?

Hm, OK.

2. Armin proposes that WaitForTerminatingWorkers needs to do CloseHandle()
on the various thread handles. That sounds plausible but I don't know
enough Windows programming to know if it really matters.

3. Should we replace ExitThread() with _endthreadex()? Again, it
seems plausible but I'm not the person to ask.

I think point (2) and (3) are related because using _endthreadex won't close
the thread handle explicitly [1].

Yep.

[1] - https://msdn.microsoft.com/en-us/library/kdzttdcb.aspx
Refer line "_endthread automatically closes the thread handle, whereas
_endthreadex does not."

And the rest of the sentence:
Therefore, when you use _beginthread and _endthread, do not explicitly
close the thread handle by calling the Win32 CloseHandle API. This
behavior differs from the Win32 ExitThread API.

Personally I understand that as well as for the first part: when using
_beginthreadex and _endthreadex, be sure to call CloseHandle() to
explicitely close the thread handle.

And based on the second part if we use ExitThread after beginning a
thread with _beginthreadex we would get unreliable behavior. I guess
you don't need a patch? Because by looking again at this thread and
the windows docs what we have now is unpredictable.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#6)
Re: PATCH: pg_restore parallel-execution-deadlock issue

Michael Paquier <michael.paquier@gmail.com> writes:

On Fri, May 27, 2016 at 4:07 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, May 27, 2016 at 3:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

2. Armin proposes that WaitForTerminatingWorkers needs to do CloseHandle()
on the various thread handles. That sounds plausible but I don't know
enough Windows programming to know if it really matters.

3. Should we replace ExitThread() with _endthreadex()? Again, it
seems plausible but I'm not the person to ask.

I think point (2) and (3) are related because using _endthreadex won't close
the thread handle explicitly [1].

Yep.

OK, I pushed something based on that. It's untested by me but the
buildfarm should tell us if I broke anything too badly.

I believe we've now dealt with all the issues originally raised by
Armin, so I've marked this patch committed in the CF app.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers