fsync error handling in pg_receivewal, pg_recvlogical

Started by Peter Eisentrautalmost 7 years ago8 messages
#1Peter Eisentraut
peter.eisentraut@2ndquadrant.com

Do we need to review the fsync error handling in pg_receivewal and
pg_recvlogical, following the recent backend changes? The current
default behavior is that these tools will log fsync errors and then
reconnect and proceed with the next data streaming in. As a result, you
might then have some files in the accumulated WAL that have not been
fsynced. Perhaps a hard exit would be more appropriate?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#2Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#1)
Re: fsync error handling in pg_receivewal, pg_recvlogical

On Fri, Mar 29, 2019 at 12:48:09PM +0100, Peter Eisentraut wrote:

Do we need to review the fsync error handling in pg_receivewal and
pg_recvlogical, following the recent backend changes? The current
default behavior is that these tools will log fsync errors and then
reconnect and proceed with the next data streaming in. As a result, you
might then have some files in the accumulated WAL that have not been
fsynced. Perhaps a hard exit would be more appropriate?

Yes, I think that we are going to need an equivalent of that for all
frontend tools. At various degrees, making sure that a fsync happens
is also important for pg_dump, pg_basebackup, pg_rewind and
pg_checksums so it is not only a problem of the two tools you mention.
It seems to me that the most correct way to have those failures would
be to use directly exit(EXIT_FAILURE) in file_utils.c where
appropriate.
--
Michael

#3Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#2)
1 attachment(s)
Re: fsync error handling in pg_receivewal, pg_recvlogical

On 2019-03-29 14:05, Michael Paquier wrote:

Yes, I think that we are going to need an equivalent of that for all
frontend tools. At various degrees, making sure that a fsync happens
is also important for pg_dump, pg_basebackup, pg_rewind and
pg_checksums so it is not only a problem of the two tools you mention.
It seems to me that the most correct way to have those failures would
be to use directly exit(EXIT_FAILURE) in file_utils.c where
appropriate.

Yeah, there is more to do. The reason I'm focusing on these two right
now is that they would typically run as a background service, and a
clean exit is most important there. In the other cases, the program
runs more often in the foreground and you can see error messages. There
are also some cases where fsync() failures are intentionally ignored
((void) casts), so some of that would need to be investigated further.

Here is a patch to get started. Note that these calls don't go through
file_utils.c, so it's a separate issue anyway.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Handle-fsync-failures-in-pg_receivewal-and-pg_recvlo.patchtext/plain; charset=UTF-8; name=0001-Handle-fsync-failures-in-pg_receivewal-and-pg_recvlo.patch; x-mac-creator=0; x-mac-type=0Download
From 8f275321775f41d4a46ea6e1b10765c68fe2653d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 25 Jun 2019 14:19:26 +0200
Subject: [PATCH] Handle fsync failures in pg_receivewal and pg_recvlogical

It is not safe to simply report an fsync error and continue.  We must
exit the program instead.
---
 src/bin/pg_basebackup/pg_recvlogical.c |  4 ++--
 src/bin/pg_basebackup/receivelog.c     | 12 ++++++------
 src/bin/pg_basebackup/walmethods.c     |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index b029118bf6..9d8f8c74f8 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -192,8 +192,8 @@ OutputFsync(TimestampTz now)
 
 	if (fsync(outfd) != 0)
 	{
-		pg_log_error("could not fsync file \"%s\": %m", outfile);
-		return false;
+		pg_log_fatal("could not fsync file \"%s\": %m", outfile);
+		exit(2);
 	}
 
 	return true;
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 2fd16421aa..81fc1b0a01 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -134,10 +134,10 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
 			/* fsync file in case of a previous crash */
 			if (stream->walmethod->sync(f) != 0)
 			{
-				pg_log_error("could not fsync existing write-ahead log file \"%s\": %s",
+				pg_log_fatal("could not fsync existing write-ahead log file \"%s\": %s",
 							 fn, stream->walmethod->getlasterror());
 				stream->walmethod->close(f, CLOSE_UNLINK);
-				return false;
+				exit(2);
 			}
 
 			walfile = f;
@@ -763,9 +763,9 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
 		{
 			if (stream->walmethod->sync(walfile) != 0)
 			{
-				pg_log_error("could not fsync file \"%s\": %s",
+				pg_log_fatal("could not fsync file \"%s\": %s",
 							 current_walfile_name, stream->walmethod->getlasterror());
-				goto error;
+				exit(2);
 			}
 			lastFlushPosition = blockpos;
 
@@ -1015,9 +1015,9 @@ ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
 			 */
 			if (stream->walmethod->sync(walfile) != 0)
 			{
-				pg_log_error("could not fsync file \"%s\": %s",
+				pg_log_fatal("could not fsync file \"%s\": %s",
 							 current_walfile_name, stream->walmethod->getlasterror());
-				return false;
+				exit(2);
 			}
 			lastFlushPosition = blockpos;
 		}
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
index 83b520898b..d8abea6dc6 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -864,7 +864,7 @@ tar_close(Walfile f, WalCloseMethod method)
 
 	/* Always fsync on close, so the padding gets fsynced */
 	if (tar_sync(f) < 0)
-		return -1;
+		exit(2);
 
 	/* Clean up and done */
 	pg_free(tf->pathname);
-- 
2.22.0

#4Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#3)
Re: fsync error handling in pg_receivewal, pg_recvlogical

On Tue, Jun 25, 2019 at 02:23:05PM +0200, Peter Eisentraut wrote:

Yeah, there is more to do. The reason I'm focusing on these two right
now is that they would typically run as a background service, and a
clean exit is most important there. In the other cases, the program
runs more often in the foreground and you can see error messages. There
are also some cases where fsync() failures are intentionally ignored
((void) casts), so some of that would need to be investigated further.

The remaining three calls all go through file_utils.c.

Here is a patch to get started. Note that these calls don't go through
file_utils.c, so it's a separate issue anyway.

Why using a different error code. Using EXIT_FAILURE is a more common
practice in the in-core binaries. The patch looks fine to me except
that, that's a good first cut.
--
Michael

#5Sehrope Sarkuni
sehrope@jackdb.com
In reply to: Michael Paquier (#4)
Re: fsync error handling in pg_receivewal, pg_recvlogical

Hi,

Tried out this patch and it applies, compiles, and passes check-world. Also
flipped things around in pg_recvlogical.c to exit-on-success to ensure it's
actually being called and that worked too. Outside of a more complicated
harness that simulates fsync errors not sure how else to test this further.

I did some searching and found a FUSE based on that looks interesting:
CharybdeFS[1]https://github.com/scylladb/charybdefs. Rather than being fixed at mount time, it has a
client/server interface so you can change the handling of syscalls on the
fly[2]https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/. For example you can error out fsync calls halfway through a test
rather than always or randomly. Haven't tried it out but leaving it here as
it seems relevant.

[1]: https://github.com/scylladb/charybdefs
[2]: https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/
https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/

On Wed, Jun 26, 2019 at 12:11 AM Michael Paquier <michael@paquier.xyz>
wrote:

Why using a different error code. Using EXIT_FAILURE is a more common
practice in the in-core binaries. The patch looks fine to me except
that, that's a good first cut.

An error code specific to fsync issues could help with tests as the harness
could check it to ensure things died for the right reasons. With a generic
"messed up fsync" harness you might even be able to run some existing tests
that would otherwise pass and check for the fsync-specific exit code.

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/

#6Sehrope Sarkuni
sehrope@jackdb.com
In reply to: Sehrope Sarkuni (#5)
Re: fsync error handling in pg_receivewal, pg_recvlogical

While reviewing this patch I read through some of the other fsync
callsites and noticed this typo (walkdir is in file_utils.c, not
initdb.c) too:

diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 315c74c745..9b79df2d7f 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -3208,7 +3208,7 @@ SyncDataDirectory(void)
  *
  * Errors are reported at level elevel, which might be ERROR or less.
  *
- * See also walkdir in initdb.c, which is a frontend version of this logic.
+ * See also walkdir in file_utils.c, which is a frontend version of this logic.
  */
 static void
 walkdir(const char *path,

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/

#7Michael Paquier
michael@paquier.xyz
In reply to: Sehrope Sarkuni (#6)
Re: fsync error handling in pg_receivewal, pg_recvlogical

On Sat, Jul 27, 2019 at 01:06:06PM -0400, Sehrope Sarkuni wrote:

While reviewing this patch I read through some of the other fsync
callsites and noticed this typo (walkdir is in file_utils.c, not
initdb.c) too:

Thanks, Sehrope. Applied.
--
Michael

#8Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sehrope Sarkuni (#5)
Re: fsync error handling in pg_receivewal, pg_recvlogical

On 2019-07-27 19:02, Sehrope Sarkuni wrote:

Tried out this patch and it applies, compiles, and passes check-world.
Also flipped things around in pg_recvlogical.c to exit-on-success to
ensure it's actually being called and that worked too. Outside of a more
complicated harness that simulates fsync errors not sure how else to
test this further.

I have committed this, with the exit code changed back, as requested by
Michael.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services