PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

Started by Nonameabout 11 years ago9 messages
#1Noname
furuyao@pm.nttdata.co.jp

Hi,

"pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.

How to reproduce:
1.Start PostgreSQL
2.Create slot
3.Specify --slot option to pg_receivexlog and start it
4.Stop PostgreSQL

Example commands for reproduce:
1.pg_ctl start
2.psql -c "select pg_create_physical_replication_slot('test');"
3.pg_receivexlog --slot='test' -D ./testdir
4.pg_ctl stop

This happen cause --slot option set a flush location to feedback messages,
but it only flush when wal has switched, so walsender wait for the WAL have been replicated forever.

When --slot option is not specified, 'invalid' is set to flush location and it use write location to check replication so it can stop the process propley.

So I thought set 'invalid' to flush location as well in this case, this problem will be solved.
Does --slot option need to set flush location?
Thought?

Regards,

--
Furuya Osamu

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Fujii Masao
masao.fujii@gmail.com
In reply to: Noname (#1)
Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

On Fri, Nov 14, 2014 at 7:22 PM, <furuyao@pm.nttdata.co.jp> wrote:

Hi,

"pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.

I could reproduce this problem. At normal shutdown, walsender keeps waiting
for the last WAL record to be replicated and flushed in pg_receivexlog. But
pg_receivexlog issues sync command only when WAL file is switched. Thus,
since pg_receivexlog may never flush the last WAL record, walsender may have
to keep waiting infinitely.

pg_recvlogical handles this problem by calling fsync() when it receives the
request of immediate reply from the server. That is, at shutdown, walsender
sends the request, pg_receivexlog receives it, flushes the last WAL record,
and sends the flush location back to the server. Since walsender can see that
the last WAL record is successfully flushed in pg_receivexlog, it can
exit cleanly.

One idea to the problem is to introduce the same logic as pg_recvlogical has,
to pg_receivexlog. Thought?

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andres Freund
andres@2ndquadrant.com
In reply to: Fujii Masao (#2)
Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

On 2014-11-15 03:25:16 +0900, Fujii Masao wrote:

On Fri, Nov 14, 2014 at 7:22 PM, <furuyao@pm.nttdata.co.jp> wrote:

Hi,

"pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.

I could reproduce this problem. At normal shutdown, walsender keeps waiting
for the last WAL record to be replicated and flushed in pg_receivexlog. But
pg_receivexlog issues sync command only when WAL file is switched. Thus,
since pg_receivexlog may never flush the last WAL record, walsender may have
to keep waiting infinitely.

Right.

pg_recvlogical handles this problem by calling fsync() when it receives the
request of immediate reply from the server. That is, at shutdown, walsender
sends the request, pg_receivexlog receives it, flushes the last WAL record,
and sends the flush location back to the server. Since walsender can see that
the last WAL record is successfully flushed in pg_receivexlog, it can
exit cleanly.

One idea to the problem is to introduce the same logic as pg_recvlogical has,
to pg_receivexlog. Thought?

Sounds sane to me. Are you looking into doing that?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Michael Paquier
michael.paquier@gmail.com
In reply to: Andres Freund (#3)
1 attachment(s)
Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

On Sat, Nov 15, 2014 at 3:42 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-11-15 03:25:16 +0900, Fujii Masao wrote:

On Fri, Nov 14, 2014 at 7:22 PM, <furuyao@pm.nttdata.co.jp> wrote:

"pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.

I could reproduce this problem. At normal shutdown, walsender keeps waiting
for the last WAL record to be replicated and flushed in pg_receivexlog. But
pg_receivexlog issues sync command only when WAL file is switched. Thus,
since pg_receivexlog may never flush the last WAL record, walsender may have
to keep waiting infinitely.

Right.

It is surprising that nobody complained about that before,
pg_receivexlog has been released two years ago.

pg_recvlogical handles this problem by calling fsync() when it receives the
request of immediate reply from the server. That is, at shutdown, walsender
sends the request, pg_receivexlog receives it, flushes the last WAL record,
and sends the flush location back to the server. Since walsender can see that
the last WAL record is successfully flushed in pg_receivexlog, it can
exit cleanly.

One idea to the problem is to introduce the same logic as pg_recvlogical has,
to pg_receivexlog. Thought?

Sounds sane to me. Are you looking into doing that?

Yep, sounds a good thing to do if master requested answer from the
client in the keepalive message. Something like the patch attached
would make the deal.
--
Michael

Attachments:

20141115_pg_receivexlog_flush_ptr.patchtext/x-patch; charset=US-ASCII; name=20141115_pg_receivexlog_flush_ptr.patchDownload
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index c6c90fb..b93d52b 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -149,6 +149,34 @@ open_walfile(XLogRecPtr startpoint, uint32 timeline, char *basedir,
 }
 
 /*
+ * Flush the current WAL file to disk and update the last WAL flush
+ * positions as well as the last fsync timestamp. On failure, print
+ * a message to stderr and return false, otherwise return true.
+ */
+static bool
+fsync_walfile(XLogRecPtr blockpos, int64 timestamp)
+{
+	/* nothing to do if no WAL file open */
+	if (walfile == -1)
+		return true;
+
+	/* nothing to do if data has already been flushed */
+	if (blockpos <= lastFlushPosition)
+		return true;
+
+	if (fsync(walfile) != 0)
+	{
+		fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
+				progname, current_walfile_name, strerror(errno));
+		return false;
+	}
+
+	lastFlushPosition = blockpos;
+	last_fsync = timestamp;
+	return true;
+}
+
+/*
  * Close the current WAL file (if open), and rename it to the correct
  * filename if it's complete. On failure, prints an error message to stderr
  * and returns false, otherwise returns true.
@@ -787,21 +815,12 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline,
 		 * If fsync_interval has elapsed since last WAL flush and we've written
 		 * some WAL data, flush them to disk.
 		 */
-		if (lastFlushPosition < blockpos &&
-			walfile != -1 &&
-			((fsync_interval > 0 &&
-			  feTimestampDifferenceExceeds(last_fsync, now, fsync_interval)) ||
-			 fsync_interval < 0))
+		if ((fsync_interval > 0 &&
+			 feTimestampDifferenceExceeds(last_fsync, now, fsync_interval)) ||
+			 fsync_interval < 0)
 		{
-			if (fsync(walfile) != 0)
-			{
-				fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
-						progname, current_walfile_name, strerror(errno));
+			if (!fsync_walfile(blockpos, now))
 				goto error;
-			}
-
-			lastFlushPosition = blockpos;
-			last_fsync = now;
 		}
 
 		/*
@@ -999,7 +1018,6 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
 {
 	int			pos;
 	bool		replyRequested;
-	int64		now;
 
 	/*
 	 * Parse the keepalive message, enclosed in the CopyData message.
@@ -1021,7 +1039,12 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
 	/* If the server requested an immediate reply, send one. */
 	if (replyRequested && still_sending)
 	{
-		now = feGetCurrentTimestamp();
+		int64		now = feGetCurrentTimestamp();
+
+		/* fsync data, so a fresh flush position is sent */
+		if (!fsync_walfile(blockpos, now))
+			return false;
+
 		if (!sendFeedback(conn, blockpos, now, false))
 			return false;
 		*last_status = now;
#5Fujii Masao
masao.fujii@gmail.com
In reply to: Michael Paquier (#4)
Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Sat, Nov 15, 2014 at 3:42 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-11-15 03:25:16 +0900, Fujii Masao wrote:

On Fri, Nov 14, 2014 at 7:22 PM, <furuyao@pm.nttdata.co.jp> wrote:

"pg_ctl stop" does't work propley, if --slot option is specified when WAL is flushed only it has switched.
These processes still continue even after the posmaster failed:pg_receivexlog, walsender and logger.

I could reproduce this problem. At normal shutdown, walsender keeps waiting
for the last WAL record to be replicated and flushed in pg_receivexlog. But
pg_receivexlog issues sync command only when WAL file is switched. Thus,
since pg_receivexlog may never flush the last WAL record, walsender may have
to keep waiting infinitely.

Right.

It is surprising that nobody complained about that before,
pg_receivexlog has been released two years ago.

It's not so surprising because the problem can happen only when
replication slot is specified, i.e., the version is 9.4 or later.

pg_recvlogical handles this problem by calling fsync() when it receives the
request of immediate reply from the server. That is, at shutdown, walsender
sends the request, pg_receivexlog receives it, flushes the last WAL record,
and sends the flush location back to the server. Since walsender can see that
the last WAL record is successfully flushed in pg_receivexlog, it can
exit cleanly.

One idea to the problem is to introduce the same logic as pg_recvlogical has,
to pg_receivexlog. Thought?

Sounds sane to me. Are you looking into doing that?

Yep, sounds a good thing to do if master requested answer from the
client in the keepalive message. Something like the patch attached
would make the deal.

Isn't it better to do this only when replication slot is used?

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Michael Paquier
michael.paquier@gmail.com
In reply to: Fujii Masao (#5)
1 attachment(s)
Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Yep, sounds a good thing to do if master requested answer from the
client in the keepalive message. Something like the patch attached
would make the deal.

Isn't it better to do this only when replication slot is used?

Makes sense. What about a check using reportFlushPosition then?
--
Michael

Attachments:

20141117_pg_receivexlog_flush_ptr.patchtext/x-patch; charset=US-ASCII; name=20141117_pg_receivexlog_flush_ptr.patchDownload
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index c6c90fb..b426cfe 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -149,6 +149,34 @@ open_walfile(XLogRecPtr startpoint, uint32 timeline, char *basedir,
 }
 
 /*
+ * Flush the current WAL file to disk and update the last WAL flush
+ * positions as well as the last fsync timestamp. On failure, print
+ * a message to stderr and return false, otherwise return true.
+ */
+static bool
+fsync_walfile(XLogRecPtr blockpos, int64 timestamp)
+{
+	/* nothing to do if no WAL file open */
+	if (walfile == -1)
+		return true;
+
+	/* nothing to do if data has already been flushed */
+	if (blockpos <= lastFlushPosition)
+		return true;
+
+	if (fsync(walfile) != 0)
+	{
+		fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
+				progname, current_walfile_name, strerror(errno));
+		return false;
+	}
+
+	lastFlushPosition = blockpos;
+	last_fsync = timestamp;
+	return true;
+}
+
+/*
  * Close the current WAL file (if open), and rename it to the correct
  * filename if it's complete. On failure, prints an error message to stderr
  * and returns false, otherwise returns true.
@@ -787,21 +815,12 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline,
 		 * If fsync_interval has elapsed since last WAL flush and we've written
 		 * some WAL data, flush them to disk.
 		 */
-		if (lastFlushPosition < blockpos &&
-			walfile != -1 &&
-			((fsync_interval > 0 &&
-			  feTimestampDifferenceExceeds(last_fsync, now, fsync_interval)) ||
-			 fsync_interval < 0))
+		if ((fsync_interval > 0 &&
+			 feTimestampDifferenceExceeds(last_fsync, now, fsync_interval)) ||
+			 fsync_interval < 0)
 		{
-			if (fsync(walfile) != 0)
-			{
-				fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
-						progname, current_walfile_name, strerror(errno));
+			if (!fsync_walfile(blockpos, now))
 				goto error;
-			}
-
-			lastFlushPosition = blockpos;
-			last_fsync = now;
 		}
 
 		/*
@@ -999,7 +1018,6 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
 {
 	int			pos;
 	bool		replyRequested;
-	int64		now;
 
 	/*
 	 * Parse the keepalive message, enclosed in the CopyData message.
@@ -1021,7 +1039,15 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
 	/* If the server requested an immediate reply, send one. */
 	if (replyRequested && still_sending)
 	{
-		now = feGetCurrentTimestamp();
+		int64		now = feGetCurrentTimestamp();
+
+		/*
+		 * Flush data, so a fresh position is sent but do this only if a
+		 * flush position needs to be reported.
+		 */
+		if (reportFlushPosition && !fsync_walfile(blockpos, now))
+			return false;
+
 		if (!sendFeedback(conn, blockpos, now, false))
 			return false;
 		*last_status = now;
#7Fujii Masao
masao.fujii@gmail.com
In reply to: Michael Paquier (#6)
Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

On Mon, Nov 17, 2014 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Yep, sounds a good thing to do if master requested answer from the
client in the keepalive message. Something like the patch attached
would make the deal.

Isn't it better to do this only when replication slot is used?

Makes sense. What about a check using reportFlushPosition then?

Sounds reasonable. Thanks for updating the patch!
But the patch could not already be applied to the master cleanly
because c4f99d2 heavily changed the code that the patch also touches...
I rewrote the patch and pushed it to both master and REL9_4_STABLE.
Anyway, thanks!

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#7)
Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

On Wed, Nov 19, 2014 at 6:16 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Mon, Nov 17, 2014 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Yep, sounds a good thing to do if master requested answer from the
client in the keepalive message. Something like the patch attached
would make the deal.

Isn't it better to do this only when replication slot is used?

Makes sense. What about a check using reportFlushPosition then?

Sounds reasonable. Thanks for updating the patch!
But the patch could not already be applied to the master cleanly
because c4f99d2 heavily changed the code that the patch also touches...
I rewrote the patch and pushed it to both master and REL9_4_STABLE.
Anyway, thanks!

Is this:

+       if (reportFlushPosition && lastFlushPosition < blockpos &&
+           walfile != 1)

really correct? Shouldn't that walfile test be against -1 (minus one)?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#8)
Re: PostgreSQL doesn't stop propley when --slot option is specified with pg_receivexlog.

On Wed, Nov 19, 2014 at 5:15 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Nov 19, 2014 at 6:16 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Mon, Nov 17, 2014 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Yep, sounds a good thing to do if master requested answer from the
client in the keepalive message. Something like the patch attached
would make the deal.

Isn't it better to do this only when replication slot is used?

Makes sense. What about a check using reportFlushPosition then?

Sounds reasonable. Thanks for updating the patch!
But the patch could not already be applied to the master cleanly
because c4f99d2 heavily changed the code that the patch also touches...
I rewrote the patch and pushed it to both master and REL9_4_STABLE.
Anyway, thanks!

Is this:

+       if (reportFlushPosition && lastFlushPosition < blockpos &&
+           walfile != 1)

really correct? Shouldn't that walfile test be against -1 (minus one)?

Ooops... You're right. That's really mistake... Just fixed and pushed. Thanks!

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers