Trap errors from streaming child in pg_basebackup to exit early

Started by Daniel Gustafssonover 4 years ago16 messages

daniel@yesql.se

over 4 years ago

1 attachment(s)

When using pg_basebackup with WAL streaming (-X stream), we have observed on a
number of times in production that the streaming child exited prematurely (to
no fault of the code it seems, most likely due to network middleboxes), which
cause the backup to fail but only after it has run to completion. On long
running backups this can consume a lot of time before it’s noticed.

By trapping the failure of the streaming process we can instead exit early to
allow the user to fix and/or restart the process.

The attached adds a SIGCHLD handler for Unix, and catch the returnvalue from
the Windows thread, in order to break out early from the main loop. It still
needs a test, and proper testing on Windows, but early feedback on the approach
would be appreciated.

--
Daniel Gustafsson https://vmware.com/

Attachments:

0001-Quick-exit-on-log-stream-child-exit-in-pg_basebackup.patchapplication/octet-stream; name=0001-Quick-exit-on-log-stream-child-exit-in-pg_basebackup.patch; x-unix-mode=0644Download

From c46d66f42d92f5d7c84f18ce62c892b3686f6745 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 23 Aug 2021 13:25:26 +0200
Subject: [PATCH] Quick exit on log stream child exit in pg_basebackup

If the log streaming child process (thread on Windows) dies during
backup then the whole backup will be aborted at the end of the
backup.  Instead, trap ungraceful termination of the log streaming
child and exit early.
---
 src/bin/pg_basebackup/pg_basebackup.c | 40 ++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 7296eb97d0..e0dd9f868b 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -174,6 +174,8 @@ static int	bgpipe[2] = {-1, -1};
 /* Handle to child process */
 static pid_t bgchild = -1;
 static bool in_log_streamer = false;
+/* State of child process */
+static sig_atomic_t bgchild_exited = false;
 
 /* End position for xlog streaming, empty string if unknown yet */
 static XLogRecPtr xlogendptr;
@@ -267,6 +269,16 @@ disconnect_atexit(void)
 		PQfinish(conn);
 }
 
+/*
+ * If the bgchild exits prematurely we can abort processing rather than wait
+ * till the backup has finished and error out at that time.
+ */
+static void
+sigchld_handler(SIGNAL_ARGS)
+{
+	bgchild_exited = true;
+}
+
 #ifndef WIN32
 /*
  * On windows, our background thread dies along with the process. But on
@@ -562,17 +574,28 @@ LogStreamerMain(logstreamer_param *param)
 											  stream.do_sync);
 
 	if (!ReceiveXlogStream(param->bgconn, &stream))
-
+	{
 		/*
 		 * Any errors will already have been reported in the function process,
 		 * but we need to tell the parent that we didn't shutdown in a nice
 		 * way.
 		 */
+#ifdef WIN32
+		/*
+		 * In order to signal the main thread of an ungraceful exit we
+		 * set the flag used on Unix to signal SIGCHLD.
+		 */
+		bgchild_exited = true;
+#endif
 		return 1;
+	}
 
 	if (!stream.walmethod->finish())
 	{
 		pg_log_error("could not finish writing WAL files: %m");
+#ifdef WIN32
+		bgchild_exited = true;
+#endif
 		return 1;
 	}
 
@@ -985,6 +1008,12 @@ ReceiveCopyData(PGconn *conn, WriteDataCallback callback,
 			exit(1);
 		}
 
+		if (bgchild_exited)
+		{
+			pg_log_error("log streamer child terminated unexpectedly");
+			exit(1);
+		}
+
 		(*callback) (r, copybuf, callback_data);
 
 		PQfreemem(copybuf);
@@ -2600,6 +2629,15 @@ main(int argc, char **argv)
 	}
 	atexit(disconnect_atexit);
 
+#ifndef WIN32
+	/*
+	 * Trap SIGCHLD to be able to handle the WAL stream process exiting. There
+	 * is no SIGCHLD on Windows, there we rely on the background thread setting
+	 * the signal variable on ungraceful exit.
+	 */
+	pqsignal(SIGCHLD, sigchld_handler);
+#endif
+
 	/*
 	 * Set umask so that directories/files are created with the same
 	 * permissions as directories/files in the source data directory.
-- 
2.24.3 (Apple Git-128)

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

over 4 years ago

In reply to: Daniel Gustafsson (#1)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Thu, Aug 26, 2021 at 2:55 PM Daniel Gustafsson <daniel@yesql.se> wrote:

When using pg_basebackup with WAL streaming (-X stream), we have observed on a
number of times in production that the streaming child exited prematurely (to
no fault of the code it seems, most likely due to network middleboxes), which
cause the backup to fail but only after it has run to completion. On long
running backups this can consume a lot of time before it’s noticed.

Hm.

By trapping the failure of the streaming process we can instead exit early to
allow the user to fix and/or restart the process.

The attached adds a SIGCHLD handler for Unix, and catch the returnvalue from
the Windows thread, in order to break out early from the main loop. It still
needs a test, and proper testing on Windows, but early feedback on the approach
would be appreciated.

Here are some comments on the patch:
1) Do we need volatile keyword here to read the value of the variables
always from the memory?
+static volatile sig_atomic_t bgchild_exited = false;

2) Do we need #ifndef WIN32 ... #endif around sigchld_handler function
definition?

3) I'm not sure if the new value of bgchild_exited being set in the
child thread will reflect in the main process on Windows? But
theoretically, I can understand that the memory will be shared between
the main process thread and child thread.
#ifdef WIN32
/*
* In order to signal the main thread of an ungraceful exit we
* set the flag used on Unix to signal SIGCHLD.
*/
bgchild_exited = true;
#endif

4) How about "set the same flag that we use on Unix to signal
SIGCHLD." instead of "* set the flag used on Unix to signal
SIGCHLD."?

5) How about "background WAL receiver terminated unexpectedly" instead
of "log streamer child terminated unexpectedly"? This will be in sync
with the existing message "starting background WAL receiver". "log
streamer" is the word used internally in the code, user doesn't know
it with that name.

6) How about giving the exit code (like postmaster's reaper function
does) instead of just a message saying unexpected termination? It will
be useful to know for what reason the process exited. For Windows, we
can use GetExitCodeThread (I'm referring to the code around waitpid in
pg_basebackup) and for Unix we can use waitpid.

Regards,
Bharath Rupireddy.

Daniel Gustafsson

daniel@yesql.se

over 4 years ago

In reply to: Bharath Rupireddy (#2)

1 attachment(s)

Re: Trap errors from streaming child in pg_basebackup to exit early

On 30 Aug 2021, at 12:31, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

Here are some comments on the patch:
1) Do we need volatile keyword here to read the value of the variables
always from the memory?
+static volatile sig_atomic_t bgchild_exited = false;

Yes, fixed.

2) Do we need #ifndef WIN32 ... #endif around sigchld_handler function
definition?

Ah yes, good point. Fixed.

3) I'm not sure if the new value of bgchild_exited being set in the
child thread will reflect in the main process on Windows? But
theoretically, I can understand that the memory will be shared between
the main process thread and child thread.

The child does not have it’s own copy of bgchild_exited.

4) How about "set the same flag that we use on Unix to signal
SIGCHLD." instead of "* set the flag used on Unix to signal
SIGCHLD."?

Fixed.

5) How about "background WAL receiver terminated unexpectedly" instead
of "log streamer child terminated unexpectedly"? This will be in sync
with the existing message "starting background WAL receiver". "log
streamer" is the word used internally in the code, user doesn't know
it with that name.

Good point, that’s better.

6) How about giving the exit code (like postmaster's reaper function
does) instead of just a message saying unexpected termination? It will
be useful to know for what reason the process exited. For Windows, we
can use GetExitCodeThread (I'm referring to the code around waitpid in
pg_basebackup) and for Unix we can use waitpid.

The rest of the program is doing exit(1) regardless of the failure of the
child/thread, so it seems more consistent to keep doing that for this class of
error as well.

A v2 with the above fixes is attached.

--
Daniel Gustafsson https://vmware.com/

Attachments:

v2-0001-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patchapplication/octet-stream; name=v2-0001-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patch; x-unix-mode=0644Download

From d31f4a268877f5551f9107e7110791228675e36a Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 23 Aug 2021 13:25:26 +0200
Subject: [PATCH v2] Quick exit on log stream child exit in pg_basebackup

If the log streaming child process (thread on Windows) dies during
backup then the whole backup will be aborted at the end of the
backup.  Instead, trap ungraceful termination of the log streaming
child and exit early.
---
 src/bin/pg_basebackup/pg_basebackup.c | 42 ++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 7296eb97d0..1a55c06834 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -174,6 +174,8 @@ static int	bgpipe[2] = {-1, -1};
 /* Handle to child process */
 static pid_t bgchild = -1;
 static bool in_log_streamer = false;
+/* State of child process */
+static volatile sig_atomic_t bgchild_exited = false;
 
 /* End position for xlog streaming, empty string if unknown yet */
 static XLogRecPtr xlogendptr;
@@ -268,6 +270,18 @@ disconnect_atexit(void)
 }
 
 #ifndef WIN32
+/*
+ * If the bgchild exits prematurely and raises a SIGCHLD signal, we can abort
+ * processing rather than wait till the backup has finished and error out at
+ * that time. On Windows we use a background thread which can communicate
+ * without the need for a signal handler.
+ */
+static void
+sigchld_handler(SIGNAL_ARGS)
+{
+	bgchild_exited = true;
+}
+
 /*
  * On windows, our background thread dies along with the process. But on
  * Unix, if we have started a subprocess, we want to kill it off so it
@@ -562,17 +576,28 @@ LogStreamerMain(logstreamer_param *param)
 											  stream.do_sync);
 
 	if (!ReceiveXlogStream(param->bgconn, &stream))
-
+	{
 		/*
 		 * Any errors will already have been reported in the function process,
 		 * but we need to tell the parent that we didn't shutdown in a nice
 		 * way.
 		 */
+#ifdef WIN32
+		/*
+		 * In order to signal the main thread of an ungraceful exit we
+		 * set the same flag that we use on Unix to signal SIGCHLD.
+		 */
+		bgchild_exited = true;
+#endif
 		return 1;
+	}
 
 	if (!stream.walmethod->finish())
 	{
 		pg_log_error("could not finish writing WAL files: %m");
+#ifdef WIN32
+		bgchild_exited = true;
+#endif
 		return 1;
 	}
 
@@ -985,6 +1010,12 @@ ReceiveCopyData(PGconn *conn, WriteDataCallback callback,
 			exit(1);
 		}
 
+		if (bgchild_exited)
+		{
+			pg_log_error("background WAL receiver terminated unexpectedly");
+			exit(1);
+		}
+
 		(*callback) (r, copybuf, callback_data);
 
 		PQfreemem(copybuf);
@@ -2600,6 +2631,15 @@ main(int argc, char **argv)
 	}
 	atexit(disconnect_atexit);
 
+#ifndef WIN32
+	/*
+	 * Trap SIGCHLD to be able to handle the WAL stream process exiting. There
+	 * is no SIGCHLD on Windows, there we rely on the background thread setting
+	 * the signal variable on ungraceful exit.
+	 */
+	pqsignal(SIGCHLD, sigchld_handler);
+#endif
+
 	/*
 	 * Set umask so that directories/files are created with the same
 	 * permissions as directories/files in the source data directory.
-- 
2.24.3 (Apple Git-128)

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

over 4 years ago

In reply to: Daniel Gustafsson (#3)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Wed, Sep 1, 2021 at 1:56 PM Daniel Gustafsson <daniel@yesql.se> wrote:

A v2 with the above fixes is attached.

Thanks for the updated patch. Here are some comments:

1) Do we need to set bgchild = -1 before the exit(1); in the code
below so that we don't kill(bgchild, SIGTERM); unnecessarily in
kill_bgchild_atexit?
+ if (bgchild_exited)
+ {
+ pg_log_error("background WAL receiver terminated unexpectedly");
+ exit(1);
+ }
+

2) Missing "," after "On Windows, we use a ....."
+ * that time. On Windows we use a background thread which can communicate

3) How about "/* Flag to indicate whether or not child process exited
*/" instead of +/* State of child process */?

4) Instead of just exiting from the main pg_basebackup process when
the child WAL receiver dies, can't we think of restarting the child
process, probably with the WAL streaming position where it left off or
stream from the beginning? This way, the work that the main
pg_basebackup has done so far doesn't get wasted. I'm not sure if this
affects the pg_basebackup functionality. We can restart the child
process for 1 or 2 times, if it still dies, we can kill the main
pg_baasebackup process too. Thoughts?

Regards,
Bharath Rupireddy.

Daniel Gustafsson

daniel@yesql.se

over 4 years ago

In reply to: Bharath Rupireddy (#4)

1 attachment(s)

Re: Trap errors from streaming child in pg_basebackup to exit early

On 1 Sep 2021, at 12:28, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

On Wed, Sep 1, 2021 at 1:56 PM Daniel Gustafsson <daniel@yesql.se> wrote:

A v2 with the above fixes is attached.

Thanks for the updated patch. Here are some comments:

1) Do we need to set bgchild = -1 before the exit(1); in the code
below so that we don't kill(bgchild, SIGTERM); unnecessarily in
kill_bgchild_atexit?

Good point. We can also inspect bgchild_exited in kill_bgchild_atexit.

2) Missing "," after "On Windows, we use a ....."
+ * that time. On Windows we use a background thread which can communicate

3) How about "/* Flag to indicate whether or not child process exited
*/" instead of +/* State of child process */?

Fixed.

4) Instead of just exiting from the main pg_basebackup process when
the child WAL receiver dies, can't we think of restarting the child
process, probably with the WAL streaming position where it left off or
stream from the beginning? This way, the work that the main
pg_basebackup has done so far doesn't get wasted. I'm not sure if this
affects the pg_basebackup functionality. We can restart the child
process for 1 or 2 times, if it still dies, we can kill the main
pg_baasebackup process too. Thoughts?

I was toying with the idea, but I ended up not pursuing it. This error is well
into the “really shouldn’t happen, but can” territory and it’s quite likely
that some level of manual intervention is required to make it successfully
restart. I’m not convinced that adding complicated logic to restart (and even
more complicated tests to simulate and test it) will be worthwhile.

--
Daniel Gustafsson https://vmware.com/

Attachments:

v3-0001-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patchapplication/octet-stream; name=v3-0001-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patch; x-unix-mode=0644Download

From 370e658031ff99bcec6b0678def498c401268813 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 23 Aug 2021 13:25:26 +0200
Subject: [PATCH v3] Quick exit on log stream child exit in pg_basebackup

If the log streaming child process (thread on Windows) dies during
backup then the whole backup will be aborted at the end of the
backup.  Instead, trap ungraceful termination of the log streaming
child and exit early.
---
 src/bin/pg_basebackup/pg_basebackup.c | 44 +++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 7296eb97d0..7d95f72fdb 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -174,6 +174,8 @@ static int	bgpipe[2] = {-1, -1};
 /* Handle to child process */
 static pid_t bgchild = -1;
 static bool in_log_streamer = false;
+/* Flag to indicate if child process exited unexpectedly */
+static volatile sig_atomic_t bgchild_exited = false;
 
 /* End position for xlog streaming, empty string if unknown yet */
 static XLogRecPtr xlogendptr;
@@ -268,6 +270,18 @@ disconnect_atexit(void)
 }
 
 #ifndef WIN32
+/*
+ * If the bgchild exits prematurely and raises a SIGCHLD signal, we can abort
+ * processing rather than wait till the backup has finished and error out at
+ * that time. On Windows, we use a background thread which can communicate
+ * without the need for a signal handler.
+ */
+static void
+sigchld_handler(SIGNAL_ARGS)
+{
+	bgchild_exited = true;
+}
+
 /*
  * On windows, our background thread dies along with the process. But on
  * Unix, if we have started a subprocess, we want to kill it off so it
@@ -276,7 +290,7 @@ disconnect_atexit(void)
 static void
 kill_bgchild_atexit(void)
 {
-	if (bgchild > 0)
+	if (bgchild > 0 && !bgchild_exited)
 		kill(bgchild, SIGTERM);
 }
 #endif
@@ -562,17 +576,28 @@ LogStreamerMain(logstreamer_param *param)
 											  stream.do_sync);
 
 	if (!ReceiveXlogStream(param->bgconn, &stream))
-
+	{
 		/*
 		 * Any errors will already have been reported in the function process,
 		 * but we need to tell the parent that we didn't shutdown in a nice
 		 * way.
 		 */
+#ifdef WIN32
+		/*
+		 * In order to signal the main thread of an ungraceful exit we
+		 * set the same flag that we use on Unix to signal SIGCHLD.
+		 */
+		bgchild_exited = true;
+#endif
 		return 1;
+	}
 
 	if (!stream.walmethod->finish())
 	{
 		pg_log_error("could not finish writing WAL files: %m");
+#ifdef WIN32
+		bgchild_exited = true;
+#endif
 		return 1;
 	}
 
@@ -985,6 +1010,12 @@ ReceiveCopyData(PGconn *conn, WriteDataCallback callback,
 			exit(1);
 		}
 
+		if (bgchild_exited)
+		{
+			pg_log_error("background WAL receiver terminated unexpectedly");
+			exit(1);
+		}
+
 		(*callback) (r, copybuf, callback_data);
 
 		PQfreemem(copybuf);
@@ -2600,6 +2631,15 @@ main(int argc, char **argv)
 	}
 	atexit(disconnect_atexit);
 
+#ifndef WIN32
+	/*
+	 * Trap SIGCHLD to be able to handle the WAL stream process exiting. There
+	 * is no SIGCHLD on Windows, there we rely on the background thread setting
+	 * the signal variable on ungraceful exit.
+	 */
+	pqsignal(SIGCHLD, sigchld_handler);
+#endif
+
 	/*
 	 * Set umask so that directories/files are created with the same
 	 * permissions as directories/files in the source data directory.
-- 
2.24.3 (Apple Git-128)

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

over 4 years ago

In reply to: Daniel Gustafsson (#5)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Fri, Sep 3, 2021 at 3:23 PM Daniel Gustafsson <daniel@yesql.se> wrote:

4) Instead of just exiting from the main pg_basebackup process when
the child WAL receiver dies, can't we think of restarting the child
process, probably with the WAL streaming position where it left off or
stream from the beginning? This way, the work that the main
pg_basebackup has done so far doesn't get wasted. I'm not sure if this
affects the pg_basebackup functionality. We can restart the child
process for 1 or 2 times, if it still dies, we can kill the main
pg_baasebackup process too. Thoughts?

I was toying with the idea, but I ended up not pursuing it. This error is well
into the “really shouldn’t happen, but can” territory and it’s quite likely
that some level of manual intervention is required to make it successfully
restart. I’m not convinced that adding complicated logic to restart (and even
more complicated tests to simulate and test it) will be worthwhile.

I withdraw my suggestion because I now feel that it's better not to
make it complex and let the user decide if at all the child process
exits abnormally.

I think we might still miss abnormal child thread exits on Windows
because we set bgchild_exited = true only if ReceiveXlogStream or
walmethod->finish() returns false. I'm not sure the parent thread on
Windows can detect a child's abnormal exit. Since there is no signal
mechanism on Windows, what the patch does is better to detect child
exit on two important functions failures.

Overall, the v3 patch looks good to me.

Regards,
Bharath Rupireddy.

Magnus Hagander

magnus@hagander.net

over 4 years ago

In reply to: Daniel Gustafsson (#5)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Fri, Sep 3, 2021 at 11:53 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 1 Sep 2021, at 12:28, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

On Wed, Sep 1, 2021 at 1:56 PM Daniel Gustafsson <daniel@yesql.se> wrote:

A v2 with the above fixes is attached.

Thanks for the updated patch. Here are some comments:

1) Do we need to set bgchild = -1 before the exit(1); in the code
below so that we don't kill(bgchild, SIGTERM); unnecessarily in
kill_bgchild_atexit?

Good point. We can also inspect bgchild_exited in kill_bgchild_atexit.

2) Missing "," after "On Windows, we use a ....."
+ * that time. On Windows we use a background thread which can communicate

3) How about "/* Flag to indicate whether or not child process exited
*/" instead of +/* State of child process */?

Fixed.

4) Instead of just exiting from the main pg_basebackup process when
the child WAL receiver dies, can't we think of restarting the child
process, probably with the WAL streaming position where it left off or
stream from the beginning? This way, the work that the main
pg_basebackup has done so far doesn't get wasted. I'm not sure if this
affects the pg_basebackup functionality. We can restart the child
process for 1 or 2 times, if it still dies, we can kill the main
pg_baasebackup process too. Thoughts?

I was toying with the idea, but I ended up not pursuing it. This error is well
into the “really shouldn’t happen, but can” territory and it’s quite likely
that some level of manual intervention is required to make it successfully
restart. I’m not convinced that adding complicated logic to restart (and even
more complicated tests to simulate and test it) will be worthwhile.

I think the restart scenario while nice, definitely means moving the
goalposts quite far. Let's get this detection in first at least, and
then we can always consider that a separate patch in the future.

Might be worth noting in one of the comments the difference in
behaviour if the backend process/thread *crashes* -- that is, on Unix
it will be detected via the signal handler and on Windows the whole
process including the main thread will die, so there is nothing to
detect.

Other places in the code just refers to the background process as "the
background process". The term "WAL receiver" is new from this patch.
While I agree it's in many ways a better term, I think (1) we should
try to be consistent, and (2) maybe use a different term than "WAL
receiver" specifically because we have a backend component with that
name.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Daniel Gustafsson

daniel@yesql.se

over 4 years ago

In reply to: Magnus Hagander (#7)

1 attachment(s)

Re: Trap errors from streaming child in pg_basebackup to exit early

On 28 Sep 2021, at 15:48, Magnus Hagander <magnus@hagander.net> wrote:

Might be worth noting in one of the comments the difference in
behaviour if the backend process/thread *crashes* -- that is, on Unix
it will be detected via the signal handler and on Windows the whole
process including the main thread will die, so there is nothing to
detect.

Good point, done.

Other places in the code just refers to the background process as "the
background process". The term "WAL receiver" is new from this patch.
While I agree it's in many ways a better term, I think (1) we should
try to be consistent, and (2) maybe use a different term than "WAL
receiver" specifically because we have a backend component with that
name.

Looking at the user-facing messaging we have before this patch, there is a bit
of variability:

On UNIX:

pg_log_error("could not create pipe for background process: %m");
pg_log_error("could not create background process: %m");
pg_log_info("could not send command to background pipe: %m");
pg_log_error("could not wait for child process: %m");

On Windows:

pg_log_error("could not create background thread: %m");
pg_log_error("could not get child thread exit status: %m");
pg_log_error("could not wait for child thread: %m");
pg_log_error("child thread exited with error %u", ..);

On Both:

pg_log_info("starting background WAL receiver");
pg_log_info("waiting for background process to finish streaming ...");

So there is one mention of a background WAL receiver already in there, but it's
pretty inconsistent as to what we call it. For now I've changed the messaging
in this patch to say "background process", leaving making this all consistent
for a follow-up patch.

The attached fixes the above, as well as the typo mentioned off-list and is
rebased on top of todays HEAD.

--
Daniel Gustafsson https://vmware.com/

Attachments:

v4-0001-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patchapplication/octet-stream; name=v4-0001-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patch; x-unix-mode=0644Download

From ea4b916b08aa05d67405f4cffa9dc7a6708ac507 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Wed, 29 Sep 2021 13:13:40 +0200
Subject: [PATCH v4] Quick exit on log stream child exit in pg_basebackup

If the log streaming child process (thread on Windows) dies during
backup then the whole backup will be aborted at the end of the
backup.  Instead, trap ungraceful termination of the log streaming
child and exit early.

Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://postgr.es/m/0F69E282-97F9-4DB7-8D6D-F927AA6340C8@yesql.se
Discussion: https://postgr.es/m/VI1PR83MB0189818B82C19059CB62E26199A89@VI1PR83MB0189.EURPRD83.prod.outlook.com
---
 src/bin/pg_basebackup/pg_basebackup.c | 47 +++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 669aa207a3..74aa5e9e4d 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -174,6 +174,8 @@ static int	bgpipe[2] = {-1, -1};
 /* Handle to child process */
 static pid_t bgchild = -1;
 static bool in_log_streamer = false;
+/* Flag to indicate if child process exited unexpectedly */
+static volatile sig_atomic_t bgchild_exited = false;
 
 /* End position for xlog streaming, empty string if unknown yet */
 static XLogRecPtr xlogendptr;
@@ -268,6 +270,18 @@ disconnect_atexit(void)
 }
 
 #ifndef WIN32
+/*
+ * If the bgchild exits prematurely and raises a SIGCHLD signal, we can abort
+ * processing rather than wait until the backup has finished and error out at
+ * that time. On Windows, we use a background thread which can communicate
+ * without the need for a signal handler.
+ */
+static void
+sigchld_handler(SIGNAL_ARGS)
+{
+	bgchild_exited = true;
+}
+
 /*
  * On windows, our background thread dies along with the process. But on
  * Unix, if we have started a subprocess, we want to kill it off so it
@@ -276,7 +290,7 @@ disconnect_atexit(void)
 static void
 kill_bgchild_atexit(void)
 {
-	if (bgchild > 0)
+	if (bgchild > 0 && !bgchild_exited)
 		kill(bgchild, SIGTERM);
 }
 #endif
@@ -562,17 +576,28 @@ LogStreamerMain(logstreamer_param *param)
 											  stream.do_sync);
 
 	if (!ReceiveXlogStream(param->bgconn, &stream))
-
+	{
 		/*
 		 * Any errors will already have been reported in the function process,
 		 * but we need to tell the parent that we didn't shutdown in a nice
 		 * way.
 		 */
+#ifdef WIN32
+		/*
+		 * In order to signal the main thread of an ungraceful exit we
+		 * set the same flag that we use on Unix to signal SIGCHLD.
+		 */
+		bgchild_exited = true;
+#endif
 		return 1;
+	}
 
 	if (!stream.walmethod->finish())
 	{
 		pg_log_error("could not finish writing WAL files: %m");
+#ifdef WIN32
+		bgchild_exited = true;
+#endif
 		return 1;
 	}
 
@@ -980,6 +1005,12 @@ ReceiveCopyData(PGconn *conn, WriteDataCallback callback,
 			exit(1);
 		}
 
+		if (bgchild_exited)
+		{
+			pg_log_error("background process terminated unexpectedly");
+			exit(1);
+		}
+
 		(*callback) (r, copybuf, callback_data);
 
 		PQfreemem(copybuf);
@@ -2595,6 +2626,18 @@ main(int argc, char **argv)
 	}
 	atexit(disconnect_atexit);
 
+#ifndef WIN32
+	/*
+	 * Trap SIGCHLD to be able to handle the WAL stream process exiting. There
+	 * is no SIGCHLD on Windows, there we rely on the background thread setting
+	 * the signal variable on unexpected but graceful exit. If the WAL stream
+	 * thread crashes on Windows it will bring down the entire process as it's
+	 * a thread, so there is nothing to catch should that happen. A crash on
+	 * UNIX will be caught by the signal handler.
+	 */
+	pqsignal(SIGCHLD, sigchld_handler);
+#endif
+
 	/*
 	 * Set umask so that directories/files are created with the same
 	 * permissions as directories/files in the source data directory.
-- 
2.24.3 (Apple Git-128)

Masahiko Sawada

sawada.mshk@gmail.com

about 4 years ago

In reply to: Daniel Gustafsson (#8)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Wed, Sep 29, 2021 at 8:18 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 28 Sep 2021, at 15:48, Magnus Hagander <magnus@hagander.net> wrote:

Might be worth noting in one of the comments the difference in
behaviour if the backend process/thread *crashes* -- that is, on Unix
it will be detected via the signal handler and on Windows the whole
process including the main thread will die, so there is nothing to
detect.

Good point, done.

Other places in the code just refers to the background process as "the
background process". The term "WAL receiver" is new from this patch.
While I agree it's in many ways a better term, I think (1) we should
try to be consistent, and (2) maybe use a different term than "WAL
receiver" specifically because we have a backend component with that
name.

Looking at the user-facing messaging we have before this patch, there is a bit
of variability:

On UNIX:

pg_log_error("could not create pipe for background process: %m");
pg_log_error("could not create background process: %m");
pg_log_info("could not send command to background pipe: %m");
pg_log_error("could not wait for child process: %m");

On Windows:

pg_log_error("could not create background thread: %m");
pg_log_error("could not get child thread exit status: %m");
pg_log_error("could not wait for child thread: %m");
pg_log_error("child thread exited with error %u", ..);

On Both:

pg_log_info("starting background WAL receiver");
pg_log_info("waiting for background process to finish streaming ...");

So there is one mention of a background WAL receiver already in there, but it's
pretty inconsistent as to what we call it. For now I've changed the messaging
in this patch to say "background process", leaving making this all consistent
for a follow-up patch.

The attached fixes the above, as well as the typo mentioned off-list and is
rebased on top of todays HEAD.

Thank you for working on this issue.

The patch looks good to me but there is one minor comment:

--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -174,6 +174,8 @@ static int bgpipe[2] = {-1, -1};
 /* Handle to child process */
 static pid_t bgchild = -1;
 static bool in_log_streamer = false;
+/* Flag to indicate if child process exited unexpectedly */
+static volatile sig_atomic_t bgchild_exited = false;

It's better to have a new line before the new comment.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#10

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

about 4 years ago

In reply to: Daniel Gustafsson (#8)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Wed, Sep 29, 2021 at 4:48 PM Daniel Gustafsson <daniel@yesql.se> wrote:

Other places in the code just refers to the background process as "the
background process". The term "WAL receiver" is new from this patch.
While I agree it's in many ways a better term, I think (1) we should
try to be consistent, and (2) maybe use a different term than "WAL
receiver" specifically because we have a backend component with that
name.

Looking at the user-facing messaging we have before this patch, there is a bit
of variability:

On UNIX:

pg_log_error("could not create pipe for background process: %m");
pg_log_error("could not create background process: %m");
pg_log_info("could not send command to background pipe: %m");
pg_log_error("could not wait for child process: %m");

On Windows:

pg_log_error("could not create background thread: %m");
pg_log_error("could not get child thread exit status: %m");
pg_log_error("could not wait for child thread: %m");
pg_log_error("child thread exited with error %u", ..);

On Both:

pg_log_info("starting background WAL receiver");
pg_log_info("waiting for background process to finish streaming ...");

So there is one mention of a background WAL receiver already in there, but it's
pretty inconsistent as to what we call it. For now I've changed the messaging
in this patch to say "background process", leaving making this all consistent
for a follow-up patch.

The attached fixes the above, as well as the typo mentioned off-list and is
rebased on top of todays HEAD.

The documentation [1]s stream of pg_basebackup specifies it as a "second
replication connection". Also, I see that the pg_receivewal.c using
the following message:
if (db_name)
{
pg_log_error("replication connection using slot \"%s\" is
unexpectedly database specific",
replication_slot);
exit(1);

We can use something like "stream replication connection" or
"background replication connection" or "background process/thread for
replication". Otherwise just "background process" on Unix and
"background thread" on Windows look fine to me. If others are okay, we
can remove the "WAL receiver" and use it consistently.

[1]: s stream
s
stream

Stream write-ahead log data while the backup is being taken. This
method will open a second connection to the server and start streaming
the write-ahead log in parallel while running the backup. Therefore,
it will require two replication connections not just one. As long as
the client can keep up with the write-ahead log data, using this
method requires no extra write-ahead logs to be saved on the source
server.

When tar format is used, the write-ahead log files will be written to
a separate file named pg_wal.tar (if the server is a version earlier
than 10, the file will be named pg_xlog.tar).

This value is the default.

Regards,
Bharath Rupireddy.

#11

Michael Paquier

michael@paquier.xyz

almost 4 years ago

In reply to: Daniel Gustafsson (#8)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Wed, Sep 29, 2021 at 01:18:40PM +0200, Daniel Gustafsson wrote:

So there is one mention of a background WAL receiver already in there, but it's
pretty inconsistent as to what we call it. For now I've changed the messaging
in this patch to say "background process", leaving making this all consistent
for a follow-up patch.

The attached fixes the above, as well as the typo mentioned off-list and is
rebased on top of todays HEAD.

I have been looking a bit at this patch, and did some tests on Windows
to find out that this is able to catch the failure of the thread
streaming the WAL segments in pg_basebackup, avoiding a completion of
the base backup, while HEAD waits until the backup finishes. Testing
this scenario is actually simple by issuing pg_terminate_backend() on
the WAL sender that streams the WAL with START_REPLICATION, while
throttling the base backup.

Could you add a test to automate this scenario? As far as I can see,
something like the following should be stable even for Windows:
1) Run a pg_basebackup in the background with IPC::Run, using
--max-rate with a minimal value to slow down the base backup, for slow
machines. 013_crash_restart.pl does that as one example with $killme.
2) Find out the WAL sender doing START_REPLICATION in the backend, and
issue pg_terminate_backend() on it.
3) Use a variant of pump_until() on the pg_basebackup process and
check after one or more failure patterns. We should refactor this
part, actually. If this new test uses the same logic, that would make
three tests doing that with 022_crash_temp_files.pl and
013_crash_restart.pl. The CI should be fine to provide any feedback
with the test in place, though I am fine to test things also in my
box.
--
Michael

#12

Daniel Gustafsson

daniel@yesql.se

almost 4 years ago

In reply to: Michael Paquier (#11)

3 attachment(s)

Re: Trap errors from streaming child in pg_basebackup to exit early

On 16 Feb 2022, at 08:27, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Sep 29, 2021 at 01:18:40PM +0200, Daniel Gustafsson wrote:

So there is one mention of a background WAL receiver already in there, but it's
pretty inconsistent as to what we call it. For now I've changed the messaging
in this patch to say "background process", leaving making this all consistent
for a follow-up patch.

The attached fixes the above, as well as the typo mentioned off-list and is
rebased on top of todays HEAD.

I have been looking a bit at this patch, and did some tests on Windows
to find out that this is able to catch the failure of the thread
streaming the WAL segments in pg_basebackup, avoiding a completion of
the base backup, while HEAD waits until the backup finishes. Testing
this scenario is actually simple by issuing pg_terminate_backend() on
the WAL sender that streams the WAL with START_REPLICATION, while
throttling the base backup.

Great, thanks!

Could you add a test to automate this scenario? As far as I can see,
something like the following should be stable even for Windows:
1) Run a pg_basebackup in the background with IPC::Run, using
--max-rate with a minimal value to slow down the base backup, for slow
machines. 013_crash_restart.pl does that as one example with $killme.
2) Find out the WAL sender doing START_REPLICATION in the backend, and
issue pg_terminate_backend() on it.
3) Use a variant of pump_until() on the pg_basebackup process and
check after one or more failure patterns. We should refactor this
part, actually. If this new test uses the same logic, that would make
three tests doing that with 022_crash_temp_files.pl and
013_crash_restart.pl. The CI should be fine to provide any feedback
with the test in place, though I am fine to test things also in my
box.

This is good idea, I was going in a different direction earlier with a test but
this is cleaner. The attached 0001 refactors pump_until; 0002 fixes a trivial
spelling error found while hacking; and 0003 is the previous patch complete
with a test that passes on Cirrus CI.

--
Daniel Gustafsson https://vmware.com/

Attachments:

v5-0001-Add-function-to-pump-IPC-process-until-string-mat.patchapplication/octet-stream; name=v5-0001-Add-function-to-pump-IPC-process-until-string-mat.patch; x-unix-mode=0644Download

From a7fe7a3a1ab49b0f56a6f7f3d4410347d10e5621 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Thu, 17 Feb 2022 23:25:45 +0100
Subject: [PATCH v5 1/3] Add function to pump IPC process until string match

Refactor the recovery tests to not carry a local duplicated copy of
the pump_until function which pumps a process until a defined string
is seen on a stream. This reduces duplication, and is a preparation
for another patch which will also use this functionality.
---
 src/test/perl/PostgreSQL/Test/Utils.pm      | 23 +++++++++++
 src/test/recovery/t/013_crash_restart.pl    | 46 +++++----------------
 src/test/recovery/t/022_crash_temp_files.pl | 45 ++++----------------
 3 files changed, 41 insertions(+), 73 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 31e2b0315e..cdbc1d6b84 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -74,6 +74,7 @@ our @EXPORT = qw(
   system_log
   run_log
   run_command
+  pump_until
 
   command_ok
   command_fails
@@ -465,6 +466,28 @@ sub run_command
 
 =pod
 
+=item pump_until(proc, timeout, stream, until)
+
+Pump until string is matched on the specified stream, or timeout occurs.
+
+=cut
+
+sub pump_until
+{
+	my ($proc, $timeout, $stream, $until) = @_;
+	$proc->pump_nb();
+	while (1)
+	{
+		last if $$stream =~ /$until/;
+		return 0 if ($timeout->is_expired);
+		return 0 if (not $proc->pumpable());
+		$proc->pump();
+	}
+	return 1;
+}
+
+=pod
+
 =item generate_ascii_string(from_char, to_char)
 
 Generate a string made of the given range of ASCII characters.
diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
index 3b740eb6f3..be31de37c5 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -71,7 +71,7 @@ CREATE TABLE alive(status text);
 INSERT INTO alive VALUES($$committed-before-sigquit$$);
 SELECT pg_backend_pid();
 ];
-ok(pump_until($killme, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
 	'acquired pid for SIGQUIT');
 my $pid = $killme_stdout;
 chomp($pid);
@@ -83,7 +83,7 @@ $killme_stdin .= q[
 BEGIN;
 INSERT INTO alive VALUES($$in-progress-before-sigquit$$) RETURNING status;
 ];
-ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigquit/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/in-progress-before-sigquit/m),
 	'inserted in-progress-before-sigquit');
 $killme_stdout = '';
 $killme_stderr = '';
@@ -96,7 +96,7 @@ $monitor_stdin .= q[
 SELECT $$psql-connected$$;
 SELECT pg_sleep(3600);
 ];
-ok(pump_until($monitor, \$monitor_stdout, qr/psql-connected/m),
+ok(pump_until($monitor, $psql_timeout, \$monitor_stdout, qr/psql-connected/m),
 	'monitor connected');
 $monitor_stdout = '';
 $monitor_stderr = '';
@@ -113,6 +113,7 @@ SELECT 1;
 ];
 ok( pump_until(
 		$killme,
+		$psql_timeout,
 		\$killme_stderr,
 		qr/WARNING:  terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost/m
 	),
@@ -126,6 +127,7 @@ $killme->finish;
 # sending.
 ok( pump_until(
 		$monitor,
+		$psql_timeout,
 		\$monitor_stderr,
 		qr/WARNING:  terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost/m
 	),
@@ -148,7 +150,7 @@ $monitor->run();
 $killme_stdin .= q[
 SELECT pg_backend_pid();
 ];
-ok(pump_until($killme, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
 	"acquired pid for SIGKILL");
 $pid = $killme_stdout;
 chomp($pid);
@@ -161,7 +163,7 @@ INSERT INTO alive VALUES($$committed-before-sigkill$$) RETURNING status;
 BEGIN;
 INSERT INTO alive VALUES($$in-progress-before-sigkill$$) RETURNING status;
 ];
-ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/in-progress-before-sigkill/m),
 	'inserted in-progress-before-sigkill');
 $killme_stdout = '';
 $killme_stderr = '';
@@ -173,7 +175,7 @@ $monitor_stdin .= q[
 SELECT $$psql-connected$$;
 SELECT pg_sleep(3600);
 ];
-ok(pump_until($monitor, \$monitor_stdout, qr/psql-connected/m),
+ok(pump_until($monitor, $psql_timeout, \$monitor_stdout, qr/psql-connected/m),
 	'monitor connected');
 $monitor_stdout = '';
 $monitor_stderr = '';
@@ -191,6 +193,7 @@ SELECT 1;
 ];
 ok( pump_until(
 		$killme,
+		$psql_timeout,
 		\$killme_stderr,
 		qr/server closed the connection unexpectedly|connection to server was lost/m
 	),
@@ -202,6 +205,7 @@ $killme->finish;
 # sending.
 ok( pump_until(
 		$monitor,
+		$psql_timeout,
 		\$monitor_stderr,
 		qr/WARNING:  terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost/m
 	),
@@ -240,34 +244,4 @@ is( $node->safe_psql(
 
 $node->stop();
 
-# Pump until string is matched, or timeout occurs
-sub pump_until
-{
-	my ($proc, $stream, $untl) = @_;
-	$proc->pump_nb();
-	while (1)
-	{
-		last if $$stream =~ /$untl/;
-		if ($psql_timeout->is_expired)
-		{
-			diag("aborting wait: program timed out");
-			diag("stream contents: >>", $$stream, "<<");
-			diag("pattern searched for: ", $untl);
-
-			return 0;
-		}
-		if (not $proc->pumpable())
-		{
-			diag("aborting wait: program died");
-			diag("stream contents: >>", $$stream, "<<");
-			diag("pattern searched for: ", $untl);
-
-			return 0;
-		}
-		$proc->pump();
-	}
-	return 1;
-
-}
-
 done_testing();
diff --git a/src/test/recovery/t/022_crash_temp_files.pl b/src/test/recovery/t/022_crash_temp_files.pl
index 6ab3092874..49dd86e848 100644
--- a/src/test/recovery/t/022_crash_temp_files.pl
+++ b/src/test/recovery/t/022_crash_temp_files.pl
@@ -57,7 +57,7 @@ my $killme = IPC::Run::start(
 $killme_stdin .= q[
 SELECT pg_backend_pid();
 ];
-ok(pump_until($killme, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
 	'acquired pid for SIGKILL');
 my $pid = $killme_stdout;
 chomp($pid);
@@ -86,7 +86,7 @@ BEGIN;
 INSERT INTO tab_crash (a) VALUES(1);
 SELECT $$insert-tuple-to-lock-next-insert$$;
 ];
-pump_until($killme2, \$killme_stdout2, qr/insert-tuple-to-lock-next-insert/m);
+pump_until($killme2, $psql_timeout, \$killme_stdout2, qr/insert-tuple-to-lock-next-insert/m);
 $killme_stdout2 = '';
 $killme_stderr2 = '';
 
@@ -99,7 +99,7 @@ BEGIN;
 SELECT $$in-progress-before-sigkill$$;
 INSERT INTO tab_crash (a) SELECT i FROM generate_series(1, 5000) s(i);
 ];
-ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/in-progress-before-sigkill/m),
 	'insert in-progress-before-sigkill');
 $killme_stdout = '';
 $killme_stderr = '';
@@ -121,7 +121,7 @@ END; $c$;
 SELECT $$insert-tuple-lock-waiting$$;
 ];
 
-pump_until($killme2, \$killme_stdout2, qr/insert-tuple-lock-waiting/m);
+pump_until($killme2, $psql_timeout, \$killme_stdout2, qr/insert-tuple-lock-waiting/m);
 $killme_stdout2 = '';
 $killme_stderr2 = '';
 
@@ -158,7 +158,7 @@ $killme->run();
 $killme_stdin .= q[
 SELECT pg_backend_pid();
 ];
-ok(pump_until($killme, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
 	'acquired pid for SIGKILL');
 $pid = $killme_stdout;
 chomp($pid);
@@ -175,7 +175,7 @@ BEGIN;
 INSERT INTO tab_crash (a) VALUES(1);
 SELECT $$insert-tuple-to-lock-next-insert$$;
 ];
-pump_until($killme2, \$killme_stdout2, qr/insert-tuple-to-lock-next-insert/m);
+pump_until($killme2, $psql_timeout, \$killme_stdout2, qr/insert-tuple-to-lock-next-insert/m);
 $killme_stdout2 = '';
 $killme_stderr2 = '';
 
@@ -188,7 +188,7 @@ BEGIN;
 SELECT $$in-progress-before-sigkill$$;
 INSERT INTO tab_crash (a) SELECT i FROM generate_series(1, 5000) s(i);
 ];
-ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/in-progress-before-sigkill/m),
 	'insert in-progress-before-sigkill');
 $killme_stdout = '';
 $killme_stderr = '';
@@ -210,7 +210,7 @@ END; $c$;
 SELECT $$insert-tuple-lock-waiting$$;
 ];
 
-pump_until($killme2, \$killme_stdout2, qr/insert-tuple-lock-waiting/m);
+pump_until($killme2, $psql_timeout, \$killme_stdout2, qr/insert-tuple-lock-waiting/m);
 $killme_stdout2 = '';
 $killme_stderr2 = '';
 
@@ -242,33 +242,4 @@ is( $node->safe_psql(
 
 $node->stop();
 
-# Pump until string is matched, or timeout occurs
-sub pump_until
-{
-	my ($proc, $stream, $untl) = @_;
-	$proc->pump_nb();
-	while (1)
-	{
-		last if $$stream =~ /$untl/;
-		if ($psql_timeout->is_expired)
-		{
-			diag("aborting wait: program timed out");
-			diag("stream contents: >>", $$stream, "<<");
-			diag("pattern searched for: ", $untl);
-
-			return 0;
-		}
-		if (not $proc->pumpable())
-		{
-			diag("aborting wait: program died");
-			diag("stream contents: >>", $$stream, "<<");
-			diag("pattern searched for: ", $untl);
-
-			return 0;
-		}
-		$proc->pump();
-	}
-	return 1;
-}
-
 done_testing();
-- 
2.24.3 (Apple Git-128)

v5-0002-Remove-duplicated-word-in-comment.patchapplication/octet-stream; name=v5-0002-Remove-duplicated-word-in-comment.patch; x-unix-mode=0644Download

From 3f40040472667d0fb7f25e3bbcfa8c46bba51465 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Thu, 17 Feb 2022 23:28:33 +0100
Subject: [PATCH v5 2/3] Remove duplicated word in comment

---
 src/test/recovery/t/013_crash_restart.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
index be31de37c5..3976e339c0 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -28,7 +28,7 @@ my $node = PostgreSQL::Test::Cluster->new('primary');
 $node->init(allows_streaming => 1);
 $node->start();
 
-# by default PostgreSQL::Test::Cluster doesn't doesn't restart after a crash
+# by default PostgreSQL::Test::Cluster doesn't restart after a crash
 $node->safe_psql(
 	'postgres',
 	q[ALTER SYSTEM SET restart_after_crash = 1;
-- 
2.24.3 (Apple Git-128)

v5-0003-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patchapplication/octet-stream; name=v5-0003-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patch; x-unix-mode=0644Download

From 6a254cd4831d64e6296907265b60fcdf86f5771a Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Thu, 17 Feb 2022 23:22:43 +0100
Subject: [PATCH v5 3/3] Quick exit on log stream child exit in pg_basebackup

If the log streaming child process (thread on Windows) dies during
backup then the whole backup will be aborted at the end of the
backup.  Instead, trap ungraceful termination of the log streaming
child and exit early.  This also adds a TAP test for simulating this
by terminating the responsible backend.

Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://postgr.es/m/0F69E282-97F9-4DB7-8D6D-F927AA6340C8@yesql.se
Discussion: https://postgr.es/m/VI1PR83MB0189818B82C19059CB62E26199A89@VI1PR83MB0189.EURPRD83.prod.outlook.com
---
 src/bin/pg_basebackup/pg_basebackup.c        | 47 +++++++++++++++++++-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 31 +++++++++++++
 2 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 08b07d5a06..ac289331c8 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -174,6 +174,8 @@ static int	bgpipe[2] = {-1, -1};
 /* Handle to child process */
 static pid_t bgchild = -1;
 static bool in_log_streamer = false;
+/* Flag to indicate if child process exited unexpectedly */
+static volatile sig_atomic_t bgchild_exited = false;
 
 /* End position for xlog streaming, empty string if unknown yet */
 static XLogRecPtr xlogendptr;
@@ -277,6 +279,18 @@ disconnect_atexit(void)
 }
 
 #ifndef WIN32
+/*
+ * If the bgchild exits prematurely and raises a SIGCHLD signal, we can abort
+ * processing rather than wait until the backup has finished and error out at
+ * that time. On Windows, we use a background thread which can communicate
+ * without the need for a signal handler.
+ */
+static void
+sigchld_handler(SIGNAL_ARGS)
+{
+	bgchild_exited = true;
+}
+
 /*
  * On windows, our background thread dies along with the process. But on
  * Unix, if we have started a subprocess, we want to kill it off so it
@@ -285,7 +299,7 @@ disconnect_atexit(void)
 static void
 kill_bgchild_atexit(void)
 {
-	if (bgchild > 0)
+	if (bgchild > 0 && !bgchild_exited)
 		kill(bgchild, SIGTERM);
 }
 #endif
@@ -572,17 +586,28 @@ LogStreamerMain(logstreamer_param *param)
 											  stream.do_sync);
 
 	if (!ReceiveXlogStream(param->bgconn, &stream))
-
+	{
 		/*
 		 * Any errors will already have been reported in the function process,
 		 * but we need to tell the parent that we didn't shutdown in a nice
 		 * way.
 		 */
+#ifdef WIN32
+		/*
+		 * In order to signal the main thread of an ungraceful exit we
+		 * set the same flag that we use on Unix to signal SIGCHLD.
+		 */
+		bgchild_exited = true;
+#endif
 		return 1;
+	}
 
 	if (!stream.walmethod->finish())
 	{
 		pg_log_error("could not finish writing WAL files: %m");
+#ifdef WIN32
+		bgchild_exited = true;
+#endif
 		return 1;
 	}
 
@@ -1126,6 +1151,12 @@ ReceiveCopyData(PGconn *conn, WriteDataCallback callback,
 			exit(1);
 		}
 
+		if (bgchild_exited)
+		{
+			pg_log_error("background process terminated unexpectedly");
+			exit(1);
+		}
+
 		(*callback) (r, copybuf, callback_data);
 
 		PQfreemem(copybuf);
@@ -2874,6 +2905,18 @@ main(int argc, char **argv)
 	}
 	atexit(disconnect_atexit);
 
+#ifndef WIN32
+	/*
+	 * Trap SIGCHLD to be able to handle the WAL stream process exiting. There
+	 * is no SIGCHLD on Windows, there we rely on the background thread setting
+	 * the signal variable on unexpected but graceful exit. If the WAL stream
+	 * thread crashes on Windows it will bring down the entire process as it's
+	 * a thread, so there is nothing to catch should that happen. A crash on
+	 * UNIX will be caught by the signal handler.
+	 */
+	pqsignal(SIGCHLD, sigchld_handler);
+#endif
+
 	/*
 	 * Set umask so that directories/files are created with the same
 	 * permissions as directories/files in the source data directory.
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 8c70e5b32b..42d42e65c2 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -778,4 +778,35 @@ SKIP:
 	rmtree("$tempdir/backup_gzip3");
 }
 
+# Test background stream process terminating before the basebackup has
+# finished, the main process should exit gracefully with an error message on
+# stderr.
+$node->safe_psql('postgres',
+	q{CREATE TABLE t AS SELECT a FROM generate_series(1,10000) AS a;});
+
+my $sigchld_bb_timeout = IPC::Run::timer(60);
+my ($sigchld_bb_stdin, $sigchld_bb_stdout, $sigchld_bb_stderr) = ('', '', '');
+my $sigchld_bb = IPC::Run::start(
+	[
+		@pg_basebackup_defs, '-X', 'stream', '-D', "$tempdir/sigchld",
+		'-r', '32', '-d', $node->connstr('postgres')
+	],
+	'<',
+	\$sigchld_bb_stdin,
+	'>',
+	\$sigchld_bb_stdout,
+	'2>',
+	\$sigchld_bb_stderr,
+	$sigchld_bb_timeout);
+
+is($node->poll_query_until('postgres',
+	"SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE " .
+	"application_name = '010_pg_basebackup.pl' AND wait_event = 'WalSenderMain' " .
+	"AND backend_type = 'walsender'"), "1", "Walsender killed");
+
+ok(pump_until($sigchld_bb, $sigchld_bb_timeout, \$sigchld_bb_stderr,
+  qr/background process terminated unexpectedly/),
+  'background process exit message');
+$sigchld_bb->finish();
+
 done_testing();
-- 
2.24.3 (Apple Git-128)

#13

Michael Paquier

michael@paquier.xyz

almost 4 years ago

In reply to: Daniel Gustafsson (#12)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Fri, Feb 18, 2022 at 10:00:43PM +0100, Daniel Gustafsson wrote:

This is good idea, I was going in a different direction earlier with a test but
this is cleaner. The attached 0001 refactors pump_until; 0002 fixes a trivial
spelling error found while hacking; and 0003 is the previous patch complete
with a test that passes on Cirrus CI.

This looks rather sane to me, and I can confirm that this passes
the CI and a manual run of MSVC tests with my own box.

+is($node->poll_query_until('postgres',
+   "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE " .
+   "application_name = '010_pg_basebackup.pl' AND wait_event =
'WalSenderMain' " .
+   "AND backend_type = 'walsender'"), "1", "Walsender killed");
If you do that, don't you have a risk to kill the WAL sender doing the
BASE_BACKUP?  That could falsify the test.  It seems to me that it
would be safer to add a check on query ~ 'START_REPLICATION' or
something like that.

- diag("aborting wait: program timed out");
- diag("stream contents: >>", $$stream, "<<");
- diag("pattern searched for: ", $untl);
Keeping some of this information around would be useful for
debugging in the refactored routine.

+my $sigchld_bb = IPC::Run::start(
+   [
+       @pg_basebackup_defs, '-X', 'stream', '-D', "$tempdir/sigchld",
+       '-r', '32', '-d', $node->connstr('postgres')
+   ],
I would recommend the use of long options here as a matter to
self-document what this does, and add a comment explaining why
--max-rate is preferable, mainly for fast machines.
--
Michael

#14

Daniel Gustafsson

daniel@yesql.se

almost 4 years ago

In reply to: Michael Paquier (#13)

3 attachment(s)

Re: Trap errors from streaming child in pg_basebackup to exit early

On 21 Feb 2022, at 03:03, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Feb 18, 2022 at 10:00:43PM +0100, Daniel Gustafsson wrote:

This is good idea, I was going in a different direction earlier with a test but
this is cleaner. The attached 0001 refactors pump_until; 0002 fixes a trivial
spelling error found while hacking; and 0003 is the previous patch complete
with a test that passes on Cirrus CI.

This looks rather sane to me, and I can confirm that this passes
the CI and a manual run of MSVC tests with my own box.

Great, thanks!

+is($node->poll_query_until('postgres',
+   "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE " .
+   "application_name = '010_pg_basebackup.pl' AND wait_event =
'WalSenderMain' " .
+   "AND backend_type = 'walsender'"), "1", "Walsender killed");
If you do that, don't you have a risk to kill the WAL sender doing the
BASE_BACKUP?  That could falsify the test.  It seems to me that it
would be safer to add a check on query ~ 'START_REPLICATION' or
something like that.

I don't think there's a risk, but I've added the check on query as well since
it also makes it more readable.

- diag("aborting wait: program timed out");
- diag("stream contents: >>", $$stream, "<<");
- diag("pattern searched for: ", $untl);
Keeping some of this information around would be useful for
debugging in the refactored routine.

Maybe, but we don't really have diag output anywhere in the modules or the
tests so I didn't see much of a precedent for keeping it. Inspectig the repo I
think we can remove two more in pg_rewind, which I just started a thread for.

+my $sigchld_bb = IPC::Run::start(
+   [
+       @pg_basebackup_defs, '-X', 'stream', '-D', "$tempdir/sigchld",
+       '-r', '32', '-d', $node->connstr('postgres')
+   ],
I would recommend the use of long options here as a matter to
self-document what this does, and add a comment explaining why
--max-rate is preferable, mainly for fast machines.

Fair enough, done.

--
Daniel Gustafsson https://vmware.com/

Attachments:

v6-0001-Add-function-to-pump-IPC-process-until-string-mat.patchapplication/octet-stream; name=v6-0001-Add-function-to-pump-IPC-process-until-string-mat.patch; x-unix-mode=0644Download

From 9674935ac0655df38fea1e5123a4a9ffb6a526b6 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Thu, 17 Feb 2022 23:25:45 +0100
Subject: [PATCH v6 1/3] Add function to pump IPC process until string match

Refactor the recovery tests to not carry a local duplicated copy of
the pump_until function which pumps a process until a defined string
is seen on a stream. This reduces duplication, and is a preparation
for another patch which will also use this functionality.
---
 src/test/perl/PostgreSQL/Test/Utils.pm      | 23 +++++++++++
 src/test/recovery/t/013_crash_restart.pl    | 46 +++++----------------
 src/test/recovery/t/022_crash_temp_files.pl | 45 ++++----------------
 3 files changed, 41 insertions(+), 73 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 2c0c72f57a..46cd746796 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -73,6 +73,7 @@ our @EXPORT = qw(
   system_log
   run_log
   run_command
+  pump_until
 
   command_ok
   command_fails
@@ -408,6 +409,28 @@ sub run_command
 
 =pod
 
+=item pump_until(proc, timeout, stream, until)
+
+Pump until string is matched on the specified stream, or timeout occurs.
+
+=cut
+
+sub pump_until
+{
+	my ($proc, $timeout, $stream, $until) = @_;
+	$proc->pump_nb();
+	while (1)
+	{
+		last if $$stream =~ /$until/;
+		return 0 if ($timeout->is_expired);
+		return 0 if (not $proc->pumpable());
+		$proc->pump();
+	}
+	return 1;
+}
+
+=pod
+
 =item generate_ascii_string(from_char, to_char)
 
 Generate a string made of the given range of ASCII characters.
diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
index 3b740eb6f3..be31de37c5 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -71,7 +71,7 @@ CREATE TABLE alive(status text);
 INSERT INTO alive VALUES($$committed-before-sigquit$$);
 SELECT pg_backend_pid();
 ];
-ok(pump_until($killme, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
 	'acquired pid for SIGQUIT');
 my $pid = $killme_stdout;
 chomp($pid);
@@ -83,7 +83,7 @@ $killme_stdin .= q[
 BEGIN;
 INSERT INTO alive VALUES($$in-progress-before-sigquit$$) RETURNING status;
 ];
-ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigquit/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/in-progress-before-sigquit/m),
 	'inserted in-progress-before-sigquit');
 $killme_stdout = '';
 $killme_stderr = '';
@@ -96,7 +96,7 @@ $monitor_stdin .= q[
 SELECT $$psql-connected$$;
 SELECT pg_sleep(3600);
 ];
-ok(pump_until($monitor, \$monitor_stdout, qr/psql-connected/m),
+ok(pump_until($monitor, $psql_timeout, \$monitor_stdout, qr/psql-connected/m),
 	'monitor connected');
 $monitor_stdout = '';
 $monitor_stderr = '';
@@ -113,6 +113,7 @@ SELECT 1;
 ];
 ok( pump_until(
 		$killme,
+		$psql_timeout,
 		\$killme_stderr,
 		qr/WARNING:  terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost/m
 	),
@@ -126,6 +127,7 @@ $killme->finish;
 # sending.
 ok( pump_until(
 		$monitor,
+		$psql_timeout,
 		\$monitor_stderr,
 		qr/WARNING:  terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost/m
 	),
@@ -148,7 +150,7 @@ $monitor->run();
 $killme_stdin .= q[
 SELECT pg_backend_pid();
 ];
-ok(pump_until($killme, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
 	"acquired pid for SIGKILL");
 $pid = $killme_stdout;
 chomp($pid);
@@ -161,7 +163,7 @@ INSERT INTO alive VALUES($$committed-before-sigkill$$) RETURNING status;
 BEGIN;
 INSERT INTO alive VALUES($$in-progress-before-sigkill$$) RETURNING status;
 ];
-ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/in-progress-before-sigkill/m),
 	'inserted in-progress-before-sigkill');
 $killme_stdout = '';
 $killme_stderr = '';
@@ -173,7 +175,7 @@ $monitor_stdin .= q[
 SELECT $$psql-connected$$;
 SELECT pg_sleep(3600);
 ];
-ok(pump_until($monitor, \$monitor_stdout, qr/psql-connected/m),
+ok(pump_until($monitor, $psql_timeout, \$monitor_stdout, qr/psql-connected/m),
 	'monitor connected');
 $monitor_stdout = '';
 $monitor_stderr = '';
@@ -191,6 +193,7 @@ SELECT 1;
 ];
 ok( pump_until(
 		$killme,
+		$psql_timeout,
 		\$killme_stderr,
 		qr/server closed the connection unexpectedly|connection to server was lost/m
 	),
@@ -202,6 +205,7 @@ $killme->finish;
 # sending.
 ok( pump_until(
 		$monitor,
+		$psql_timeout,
 		\$monitor_stderr,
 		qr/WARNING:  terminating connection because of crash of another server process|server closed the connection unexpectedly|connection to server was lost/m
 	),
@@ -240,34 +244,4 @@ is( $node->safe_psql(
 
 $node->stop();
 
-# Pump until string is matched, or timeout occurs
-sub pump_until
-{
-	my ($proc, $stream, $untl) = @_;
-	$proc->pump_nb();
-	while (1)
-	{
-		last if $$stream =~ /$untl/;
-		if ($psql_timeout->is_expired)
-		{
-			diag("aborting wait: program timed out");
-			diag("stream contents: >>", $$stream, "<<");
-			diag("pattern searched for: ", $untl);
-
-			return 0;
-		}
-		if (not $proc->pumpable())
-		{
-			diag("aborting wait: program died");
-			diag("stream contents: >>", $$stream, "<<");
-			diag("pattern searched for: ", $untl);
-
-			return 0;
-		}
-		$proc->pump();
-	}
-	return 1;
-
-}
-
 done_testing();
diff --git a/src/test/recovery/t/022_crash_temp_files.pl b/src/test/recovery/t/022_crash_temp_files.pl
index 6ab3092874..49dd86e848 100644
--- a/src/test/recovery/t/022_crash_temp_files.pl
+++ b/src/test/recovery/t/022_crash_temp_files.pl
@@ -57,7 +57,7 @@ my $killme = IPC::Run::start(
 $killme_stdin .= q[
 SELECT pg_backend_pid();
 ];
-ok(pump_until($killme, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
 	'acquired pid for SIGKILL');
 my $pid = $killme_stdout;
 chomp($pid);
@@ -86,7 +86,7 @@ BEGIN;
 INSERT INTO tab_crash (a) VALUES(1);
 SELECT $$insert-tuple-to-lock-next-insert$$;
 ];
-pump_until($killme2, \$killme_stdout2, qr/insert-tuple-to-lock-next-insert/m);
+pump_until($killme2, $psql_timeout, \$killme_stdout2, qr/insert-tuple-to-lock-next-insert/m);
 $killme_stdout2 = '';
 $killme_stderr2 = '';
 
@@ -99,7 +99,7 @@ BEGIN;
 SELECT $$in-progress-before-sigkill$$;
 INSERT INTO tab_crash (a) SELECT i FROM generate_series(1, 5000) s(i);
 ];
-ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/in-progress-before-sigkill/m),
 	'insert in-progress-before-sigkill');
 $killme_stdout = '';
 $killme_stderr = '';
@@ -121,7 +121,7 @@ END; $c$;
 SELECT $$insert-tuple-lock-waiting$$;
 ];
 
-pump_until($killme2, \$killme_stdout2, qr/insert-tuple-lock-waiting/m);
+pump_until($killme2, $psql_timeout, \$killme_stdout2, qr/insert-tuple-lock-waiting/m);
 $killme_stdout2 = '';
 $killme_stderr2 = '';
 
@@ -158,7 +158,7 @@ $killme->run();
 $killme_stdin .= q[
 SELECT pg_backend_pid();
 ];
-ok(pump_until($killme, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/[[:digit:]]+[\r\n]$/m),
 	'acquired pid for SIGKILL');
 $pid = $killme_stdout;
 chomp($pid);
@@ -175,7 +175,7 @@ BEGIN;
 INSERT INTO tab_crash (a) VALUES(1);
 SELECT $$insert-tuple-to-lock-next-insert$$;
 ];
-pump_until($killme2, \$killme_stdout2, qr/insert-tuple-to-lock-next-insert/m);
+pump_until($killme2, $psql_timeout, \$killme_stdout2, qr/insert-tuple-to-lock-next-insert/m);
 $killme_stdout2 = '';
 $killme_stderr2 = '';
 
@@ -188,7 +188,7 @@ BEGIN;
 SELECT $$in-progress-before-sigkill$$;
 INSERT INTO tab_crash (a) SELECT i FROM generate_series(1, 5000) s(i);
 ];
-ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
+ok(pump_until($killme, $psql_timeout, \$killme_stdout, qr/in-progress-before-sigkill/m),
 	'insert in-progress-before-sigkill');
 $killme_stdout = '';
 $killme_stderr = '';
@@ -210,7 +210,7 @@ END; $c$;
 SELECT $$insert-tuple-lock-waiting$$;
 ];
 
-pump_until($killme2, \$killme_stdout2, qr/insert-tuple-lock-waiting/m);
+pump_until($killme2, $psql_timeout, \$killme_stdout2, qr/insert-tuple-lock-waiting/m);
 $killme_stdout2 = '';
 $killme_stderr2 = '';
 
@@ -242,33 +242,4 @@ is( $node->safe_psql(
 
 $node->stop();
 
-# Pump until string is matched, or timeout occurs
-sub pump_until
-{
-	my ($proc, $stream, $untl) = @_;
-	$proc->pump_nb();
-	while (1)
-	{
-		last if $$stream =~ /$untl/;
-		if ($psql_timeout->is_expired)
-		{
-			diag("aborting wait: program timed out");
-			diag("stream contents: >>", $$stream, "<<");
-			diag("pattern searched for: ", $untl);
-
-			return 0;
-		}
-		if (not $proc->pumpable())
-		{
-			diag("aborting wait: program died");
-			diag("stream contents: >>", $$stream, "<<");
-			diag("pattern searched for: ", $untl);
-
-			return 0;
-		}
-		$proc->pump();
-	}
-	return 1;
-}
-
 done_testing();
-- 
2.24.3 (Apple Git-128)

v6-0002-Remove-duplicated-word-in-comment.patchapplication/octet-stream; name=v6-0002-Remove-duplicated-word-in-comment.patch; x-unix-mode=0644Download

From 4cd2b9492b4a9f9713941cf54b0b66b9a9368583 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Thu, 17 Feb 2022 23:28:33 +0100
Subject: [PATCH v6 2/3] Remove duplicated word in comment

---
 src/test/recovery/t/013_crash_restart.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
index be31de37c5..3976e339c0 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -28,7 +28,7 @@ my $node = PostgreSQL::Test::Cluster->new('primary');
 $node->init(allows_streaming => 1);
 $node->start();
 
-# by default PostgreSQL::Test::Cluster doesn't doesn't restart after a crash
+# by default PostgreSQL::Test::Cluster doesn't restart after a crash
 $node->safe_psql(
 	'postgres',
 	q[ALTER SYSTEM SET restart_after_crash = 1;
-- 
2.24.3 (Apple Git-128)

v6-0003-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patchapplication/octet-stream; name=v6-0003-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patch; x-unix-mode=0644Download

From dbf24c425d6708ec8aaae2f752c8b104c784ed52 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Thu, 17 Feb 2022 23:22:43 +0100
Subject: [PATCH v6 3/3] Quick exit on log stream child exit in pg_basebackup

If the log streaming child process (thread on Windows) dies during
backup then the whole backup will be aborted at the end of the
backup.  Instead, trap ungraceful termination of the log streaming
child and exit early.  This also adds a TAP test for simulating this
by terminating the responsible backend.

Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://postgr.es/m/0F69E282-97F9-4DB7-8D6D-F927AA6340C8@yesql.se
Discussion: https://postgr.es/m/VI1PR83MB0189818B82C19059CB62E26199A89@VI1PR83MB0189.EURPRD83.prod.outlook.com
---
 src/bin/pg_basebackup/pg_basebackup.c        | 47 +++++++++++++++++++-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 34 ++++++++++++++
 2 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 08b07d5a06..ac289331c8 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -174,6 +174,8 @@ static int	bgpipe[2] = {-1, -1};
 /* Handle to child process */
 static pid_t bgchild = -1;
 static bool in_log_streamer = false;
+/* Flag to indicate if child process exited unexpectedly */
+static volatile sig_atomic_t bgchild_exited = false;
 
 /* End position for xlog streaming, empty string if unknown yet */
 static XLogRecPtr xlogendptr;
@@ -277,6 +279,18 @@ disconnect_atexit(void)
 }
 
 #ifndef WIN32
+/*
+ * If the bgchild exits prematurely and raises a SIGCHLD signal, we can abort
+ * processing rather than wait until the backup has finished and error out at
+ * that time. On Windows, we use a background thread which can communicate
+ * without the need for a signal handler.
+ */
+static void
+sigchld_handler(SIGNAL_ARGS)
+{
+	bgchild_exited = true;
+}
+
 /*
  * On windows, our background thread dies along with the process. But on
  * Unix, if we have started a subprocess, we want to kill it off so it
@@ -285,7 +299,7 @@ disconnect_atexit(void)
 static void
 kill_bgchild_atexit(void)
 {
-	if (bgchild > 0)
+	if (bgchild > 0 && !bgchild_exited)
 		kill(bgchild, SIGTERM);
 }
 #endif
@@ -572,17 +586,28 @@ LogStreamerMain(logstreamer_param *param)
 											  stream.do_sync);
 
 	if (!ReceiveXlogStream(param->bgconn, &stream))
-
+	{
 		/*
 		 * Any errors will already have been reported in the function process,
 		 * but we need to tell the parent that we didn't shutdown in a nice
 		 * way.
 		 */
+#ifdef WIN32
+		/*
+		 * In order to signal the main thread of an ungraceful exit we
+		 * set the same flag that we use on Unix to signal SIGCHLD.
+		 */
+		bgchild_exited = true;
+#endif
 		return 1;
+	}
 
 	if (!stream.walmethod->finish())
 	{
 		pg_log_error("could not finish writing WAL files: %m");
+#ifdef WIN32
+		bgchild_exited = true;
+#endif
 		return 1;
 	}
 
@@ -1126,6 +1151,12 @@ ReceiveCopyData(PGconn *conn, WriteDataCallback callback,
 			exit(1);
 		}
 
+		if (bgchild_exited)
+		{
+			pg_log_error("background process terminated unexpectedly");
+			exit(1);
+		}
+
 		(*callback) (r, copybuf, callback_data);
 
 		PQfreemem(copybuf);
@@ -2874,6 +2905,18 @@ main(int argc, char **argv)
 	}
 	atexit(disconnect_atexit);
 
+#ifndef WIN32
+	/*
+	 * Trap SIGCHLD to be able to handle the WAL stream process exiting. There
+	 * is no SIGCHLD on Windows, there we rely on the background thread setting
+	 * the signal variable on unexpected but graceful exit. If the WAL stream
+	 * thread crashes on Windows it will bring down the entire process as it's
+	 * a thread, so there is nothing to catch should that happen. A crash on
+	 * UNIX will be caught by the signal handler.
+	 */
+	pqsignal(SIGCHLD, sigchld_handler);
+#endif
+
 	/*
 	 * Set umask so that directories/files are created with the same
 	 * permissions as directories/files in the source data directory.
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 75d6810d3e..8cb8cfe045 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -776,4 +776,38 @@ SKIP:
 	rmtree("$tempdir/backup_gzip3");
 }
 
+# Test background stream process terminating before the basebackup has
+# finished, the main process should exit gracefully with an error message on
+# stderr. To reduce the risk of timing related issues we invoke the base
+# backup with rate throttling enabled.
+$node->safe_psql('postgres',
+	q{CREATE TABLE t AS SELECT a FROM generate_series(1,10000) AS a;});
+
+my $sigchld_bb_timeout = IPC::Run::timer(60);
+my ($sigchld_bb_stdin, $sigchld_bb_stdout, $sigchld_bb_stderr) = ('', '', '');
+my $sigchld_bb = IPC::Run::start(
+	[
+		@pg_basebackup_defs, '--wal-method=stream', '-D', "$tempdir/sigchld",
+		'--max-rate=32', '-d', $node->connstr('postgres')
+	],
+	'<',
+	\$sigchld_bb_stdin,
+	'>',
+	\$sigchld_bb_stdout,
+	'2>',
+	\$sigchld_bb_stderr,
+	$sigchld_bb_timeout);
+
+is($node->poll_query_until('postgres',
+	"SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE " .
+	"application_name = '010_pg_basebackup.pl' AND wait_event = 'WalSenderMain' " .
+	"AND backend_type = 'walsender' AND query ~ 'START_REPLICATION'"),
+	"1",
+	"Walsender killed");
+
+ok(pump_until($sigchld_bb, $sigchld_bb_timeout, \$sigchld_bb_stderr,
+  qr/background process terminated unexpectedly/),
+  'background process exit message');
+$sigchld_bb->finish();
+
 done_testing();
-- 
2.24.3 (Apple Git-128)

#15

Michael Paquier

michael@paquier.xyz

almost 4 years ago

In reply to: Daniel Gustafsson (#14)

Re: Trap errors from streaming child in pg_basebackup to exit early

On Mon, Feb 21, 2022 at 03:11:30PM +0100, Daniel Gustafsson wrote:

On 21 Feb 2022, at 03:03, Michael Paquier <michael@paquier.xyz> wrote:

+is($node->poll_query_until('postgres',
+   "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE " .
+   "application_name = '010_pg_basebackup.pl' AND wait_event =
'WalSenderMain' " .
+   "AND backend_type = 'walsender'"), "1", "Walsender killed");
If you do that, don't you have a risk to kill the WAL sender doing the
BASE_BACKUP?  That could falsify the test.  It seems to me that it
would be safer to add a check on query ~ 'START_REPLICATION' or
something like that.

I don't think there's a risk, but I've added the check on query as well since
it also makes it more readable.

Okay, thanks.

- diag("aborting wait: program timed out");
- diag("stream contents: >>", $$stream, "<<");
- diag("pattern searched for: ", $untl);
Keeping some of this information around would be useful for
debugging in the refactored routine.

Maybe, but we don't really have diag output anywhere in the modules or the
tests so I didn't see much of a precedent for keeping it. Inspectig the repo I
think we can remove two more in pg_rewind, which I just started a thread for.

Hmm. If you think this is better this way, I won't fight hard on this
point, either.

The patch set looks fine overall.
--
Michael

#16

Daniel Gustafsson

daniel@yesql.se

almost 4 years ago

In reply to: Michael Paquier (#15)

Re: Trap errors from streaming child in pg_basebackup to exit early

On 22 Feb 2022, at 02:13, Michael Paquier <michael@paquier.xyz> wrote:

The patch set looks fine overall.

Thanks for reviewing and testing, I pushed this today after taking another
round at it.

--
Daniel Gustafsson https://vmware.com/