Patch: add recovery_timeout option to control timeout of restore_command nonzero status code
Hello everyone.
* Project name: Add recovery_timeout option to control timeout of restore_command nonzero status code
* Uniquely identifiable file name, so we can tell the difference between your v1 and v24: 0001-add-recovery_timeout-to-controll-timeout-between-res.patch
* What the patch does in a short paragraph: This patch adds a recovery_timeout option, which controls how long the server waits before retrying restore_command after it returns a nonzero status code. Right now the default value is 5 seconds. This is useful when WAL segments are restored from external storage (like AWS S3) and the standby is expected to lag behind the master anyway. The problem is that each request to AWS S3 costs money, so N nodes polling for the next WAL segment every 5 seconds cost more than, for example, polling every 30 seconds. Until now I did this as follows: " if ! (/usr/local/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"); then sleep 60; fi ". But with that workaround, restarting/stopping the database is slower (a sketch of both approaches appears below, after this list).
* Whether the patch is for discussion or for application: No such thing.
* Which branch the patch is against: master branch
* Whether it compiles and tests successfully, so we know nothing obvious is broken: it compiles and passes tests on my local machine.
* Whether it contains any platform-specific items and if so, has it been tested on other platforms: I hope not.
* Confirm that the patch includes regression tests to check the new feature actually works as described: No, it doesn't have tests. I haven't found how to test new config variables.
* Include documentation: added.
* Describe the effect your patch has on performance, if any: it shouldn't affect database performance.
This is my first patch. I am not sure about the name of the option. Maybe it should be called "recovery_nonzero_timeout".
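For illustration, a hedged sketch of both approaches as recovery.conf fragments; the wal-e command is the one from the description above, while the option name and value syntax (recovery_timeout = '30s') are only a proposal at this stage:

    # Current workaround: make restore_command itself sleep on failure.
    # This also makes restart/stop of the standby slower, because the
    # startup process can be stuck inside the sleep.
    restore_command = 'if ! (/usr/local/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"); then sleep 60; fi'

    # With the proposed option the wait moves into the server, so the
    # startup process can still react promptly to shutdown and promotion.
    restore_command = '/usr/local/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"'
    recovery_timeout = '30s'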
--
Alexey Vasiliev
Attachments:
0001-add-recovery_timeout-to-controll-timeout-between-res.patch (application/x-patch)
From 35abe56b2497f238a6888fe98c54aa9cb5300866 Mon Sep 17 00:00:00 2001
From: Alexey Vasiliev <leopard.not.a@gmail.com>
Date: Mon, 3 Nov 2014 00:21:14 +0200
Subject: [PATCH] add recovery_timeout to controll timeout between
restore_command nonzero
---
doc/src/sgml/recovery-config.sgml | 16 ++++++++++++++++
src/backend/access/transam/recovery.conf.sample | 5 +++++
src/backend/access/transam/xlog.c | 17 ++++++++++++++++-
3 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0f1ff34..bc410b3 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,22 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="restore-timeout" xreflabel="restore_timeout">
+ <term><varname>restore_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>restore_timeout</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ By default, if <varname>restore_command</> return nonzero status,
+ server will retry command again after 5 seconds. This parameter
+ allow to change this time. This parameter is optional. This can
+ be useful to increase/decrease number of a restore_command calls.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index 7657df3..282e898 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,11 @@
#
#recovery_end_command = ''
#
+# specifies an optional timeout after nonzero code of restore_command.
+# This can be useful to increase/decrease number of a restore_command calls.
+#
+#restore_timeout = ''
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3c9aeae..98f0fca 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -233,6 +233,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int restore_timeout = 0;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -5245,6 +5246,20 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "restore_timeout") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &restore_timeout, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "restore_timeout"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("restore_timeout = '%s'", item->value)));
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -11110,7 +11125,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ restore_timeout > 0 ? restore_timeout : 5000L);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
}
--
1.9.3 (Apple Git-50)
On 11/3/14 6:04 AM, Alexey Vasiliev wrote:
3. What the patch does in a short paragraph: This patch adds a
recovery_timeout option, which controls how long the server waits before
retrying restore_command after it returns a nonzero status code. Right now
the default value is 5 seconds. This is useful when WAL segments are
restored from external storage (like AWS S3) and the standby is expected
to lag behind the master anyway. The problem is that each request to
AWS S3 costs money, so N nodes polling for the next WAL segment every
5 seconds cost more than, for example, polling every 30 seconds.
That seems useful. I would include something about this use case in the
documentation.
This is my first patch. I am not sure about the name of the option. Maybe it
should be called "recovery_nonzero_timeout".
The option name had me confused. At first I thought this was the time
after which a running restore_command invocation gets killed. I think a
more precise description might be restore_command_retry_interval.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Mon, 03 Nov 2014 14:36:33 -0500 from Peter Eisentraut <peter_e@gmx.net>:
On 11/3/14 6:04 AM, Alexey Vasiliev wrote:
3. What the patch does in a short paragraph: This patch adds a
recovery_timeout option, which controls how long the server waits before
retrying restore_command after it returns a nonzero status code. Right now
the default value is 5 seconds. This is useful when WAL segments are
restored from external storage (like AWS S3) and the standby is expected
to lag behind the master anyway. The problem is that each request to
AWS S3 costs money, so N nodes polling for the next WAL segment every
5 seconds cost more than, for example, polling every 30 seconds.
That seems useful. I would include something about this use case in the
documentation.
Ok, I will add this to the patch.
This is my first patch. I am not sure about the name of the option. Maybe it
should be called "recovery_nonzero_timeout".
The option name had me confused. At first I thought this was the time
after which a running restore_command invocation gets killed. I think a
more precise description might be restore_command_retry_interval.
"restore_command_retry_interval" - I like this name!
Should I change my patch and send it by email? And also, as I understand it, I should change the message ID for https://commitfest.postgresql.org/action/patch_view?id=1636, shouldn't I?
Thanks
--
Alexey Vasiliev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Should I change my patch and send it by email? And also as I understand I
should change message ID for
https://commitfest.postgresql.org/action/patch_view?id=1636, isn't it?
You should just send another version of your patch by email and add a new
"comment" to commit-fest using the "Patch" comment type.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Mon, 3 Nov 2014 19:18:51 -0200 from Fabrízio de Royes Mello <fabriziomello@gmail.com>:
Should I change my patch and send it by email? And also as I understand I should change message ID for https://commitfest.postgresql.org/action/patch_view?id=1636 , isn't it?
You should just send another version of your patch by email and add a new "comment" to commit-fest using the "Patch" comment type.
Added new patch.
--
Alexey Vasiliev
Attachments:
0001-add-recovery_timeout-to-controll-timeout-between-res.patch (application/x-patch)
From a5d941717df71e4c16941b8a02f0dca40a1107a0 Mon Sep 17 00:00:00 2001
From: Alexey Vasiliev <leopard.not.a@gmail.com>
Date: Mon, 3 Nov 2014 00:21:14 +0200
Subject: [PATCH] add recovery_timeout to controll timeout between
restore_command nonzero
---
doc/src/sgml/recovery-config.sgml | 24 ++++++++++++++++++++++++
src/backend/access/transam/recovery.conf.sample | 5 +++++
src/backend/access/transam/xlog.c | 20 ++++++++++++++++++--
3 files changed, 47 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0f1ff34..98eb451 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,30 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval">
+ <term><varname>restore_command_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>restore_command_retry_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ By default, if <varname>restore_command</> return nonzero status,
+ server will retry command again after 5 seconds. This parameter
+ allow to change this time. This parameter is optional. This can
+ be useful to increase/decrease number of a restore_command calls.
+ </para>
+ <para>
+ This is useful, if I using for restore of wal logs some
+ external storage (like AWS S3) and no matter what the slave database
+ will lag behind the master. The problem, what for each request to
+ AWS S3 need to pay, what is why for N nodes, which try to get next
+ wal log each 5 seconds will be bigger price, than for example each
+ 30 seconds.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index 7657df3..4eee8f4 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,11 @@
#
#recovery_end_command = ''
#
+# specifies an optional timeout after nonzero code of restore_command.
+# This can be useful to increase/decrease number of a restore_command calls.
+#
+#restore_command_retry_interval = 5s
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3c9aeae..c26101e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -233,6 +233,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int restore_command_retry_interval = 0;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -5245,6 +5246,20 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "restore_command_retry_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "restore_command_retry_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("restore_command_retry_interval = '%s'", item->value)));
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -11104,13 +11119,14 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
/*
- * Wait for more WAL to arrive. Time out after 5 seconds,
+ * Wait for more WAL to arrive. Time out after
+ * restore_command_retry_interval (5 seconds by default),
* like when polling the archive, to react to a trigger
* file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ restore_command_retry_interval > 0 ? restore_command_retry_interval : 5000L);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
}
--
1.9.3 (Apple Git-50)
On Mon, Nov 3, 2014 at 7:25 PM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
Mon, 3 Nov 2014 19:18:51 -0200 from Fabrízio de Royes Mello <fabriziomello@gmail.com>:
Should I change my patch and send it by email? And also as I
understand I should change message ID for
https://commitfest.postgresql.org/action/patch_view?id=1636 , isn't it?
You should just send another version of your patch by email and add a
new "comment" to commit-fest using the "Patch" comment type.
Added new patch.
And you should add the patch to an open commitfest, not to one already in progress.
The next open one is 2014-12 [1].
Regards,
[1]: https://commitfest.postgresql.org/action/commitfest_view?id=25
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Mon, 3 Nov 2014 22:55:02 -0200 from Fabrízio de Royes Mello <fabriziomello@gmail.com>:
On Mon, Nov 3, 2014 at 7:25 PM, Alexey Vasiliev < leopard_ne@inbox.ru > wrote:
Mon, 3 Nov 2014 19:18:51 -0200 from Fabrízio de Royes Mello <fabriziomello@gmail.com>:
Should I change my patch and send it by email? And also as I understand I should change message ID for https://commitfest.postgresql.org/action/patch_view?id=1636 , isn't it?
You should just send another version of your patch by email and add a new "comment" to commit-fest using the "Patch" comment type.
Added new patch.
And you should add the patch to an open commitfest, not to one already in progress. The next open one is 2014-12 [1].
Moved. Thanks.
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
--
Alexey Vasiliev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
On 2014-11-03 14:04:00 +0300, Alexey Vasiliev wrote:
* What the patch does in a short paragraph: This patch adds a recovery_timeout option, which controls how long the server waits before retrying restore_command after it returns a nonzero status code. Right now the default value is 5 seconds. This is useful when WAL segments are restored from external storage (like AWS S3) and the standby is expected to lag behind the master anyway. The problem is that each request to AWS S3 costs money, so N nodes polling for the next WAL segment every 5 seconds cost more than, for example, polling every 30 seconds. Until now I did this as follows: " if ! (/usr/local/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"); then sleep 60; fi ". But with that workaround, restarting/stopping the database is slower.
Without saying that the feature is unnecessary, wouldn't this better be
solved by using streaming rep most of the time?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tue, 4 Nov 2014 14:41:56 +0100 from Andres Freund <andres@2ndquadrant.com>:
Hi,
On 2014-11-03 14:04:00 +0300, Alexey Vasiliev wrote:
* What the patch does in a short paragraph: This patch adds a recovery_timeout option, which controls how long the server waits before retrying restore_command after it returns a nonzero status code. Right now the default value is 5 seconds. This is useful when WAL segments are restored from external storage (like AWS S3) and the standby is expected to lag behind the master anyway. The problem is that each request to AWS S3 costs money, so N nodes polling for the next WAL segment every 5 seconds cost more than, for example, polling every 30 seconds. Until now I did this as follows: " if ! (/usr/local/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"); then sleep 60; fi ". But with that workaround, restarting/stopping the database is slower.
Without saying that the feature is unnecessary, wouldn't this better be
solved by using streaming rep most of the time?
But we don't need streaming rep. The master database doesn't need to know about the slaves (and there's no need to add even that little overhead to the master). The slaves read WAL from S3, and the same S3 WAL archive is used for backups.
--
Alexey Vasiliev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Nov 4, 2014 at 6:25 AM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
Added new patch.
Seems useful to me to be able to tune this interval of time.
I would simply reword the documentation as follows:
If <varname>restore_command</> returns nonzero exit status code, retry
command after the interval of time specified by this parameter.
Default value is <literal>5s</>.
Also, I think that it would be a good idea to error out if this
parameter has a value of let's say, less than 1s instead of doing a
check for a positive value in the waiting latch. On top of that, the
default value of restore_command_retry_interval should be 5000 and not
0 to simplify the code.
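For concreteness, here is a minimal sketch of the suggested validation as it could look in readRecoveryCommandFile(), reusing the parse_int/ereport pattern from the patch above and assuming restore_command_retry_interval is declared with a default of 5000; the exact lower bound and error wording are placeholders, not a final proposal:

	/* sketch only: default of 5000 ms, reject values below one second */
	else if (strcmp(item->name, "restore_command_retry_interval") == 0)
	{
		const char *hintmsg;

		if (!parse_int(item->value, &restore_command_retry_interval,
					   GUC_UNIT_MS, &hintmsg))
			ereport(ERROR,
					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
					 errmsg("parameter \"%s\" requires a temporal value",
							"restore_command_retry_interval"),
					 hintmsg ? errhint("%s", _(hintmsg)) : 0));
		if (restore_command_retry_interval < 1000)
			ereport(ERROR,
					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
					 errmsg("\"%s\" must be at least 1 second",
							"restore_command_retry_interval")));
	}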
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Thanks for the suggestions.
Patch updated.
Mon, 22 Dec 2014 12:07:06 +0900 from Michael Paquier <michael.paquier@gmail.com>:
On Tue, Nov 4, 2014 at 6:25 AM, Alexey Vasiliev < leopard_ne@inbox.ru > wrote:
Added new patch.
Seems useful to me to be able to tune this interval of time.
I would simply reword the documentation as follows:
If <varname>restore_command</> returns nonzero exit status code, retry
command after the interval of time specified by this parameter.
Default value is <literal>5s</>.
Also, I think that it would be a good idea to error out if this
parameter has a value of let's say, less than 1s instead of doing a
check for a positive value in the waiting latch. On top of that, the
default value of restore_command_retry_interval should be 5000 and not
0 to simplify the code.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Alexey Vasiliev
Attachments:
0001-add-recovery_timeout-to-controll-timeout-between-res.patch (application/x-patch)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index ef78bc0..53ccf13 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,29 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval">
+ <term><varname>restore_command_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>restore_command_retry_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If <varname>restore_command</> returns nonzero exit status code, retry
+ command after the interval of time specified by this parameter.
+ Default value is <literal>5s</>.
+ </para>
+ <para>
+ This is useful, if I using for restore of wal logs some
+ external storage (like AWS S3) and no matter what the slave database
+ will lag behind the master. The problem, what for each request to
+ AWS S3 need to pay, what is why for N nodes, which try to get next
+ wal log each 5 seconds will be bigger price, than for example each
+ 30 seconds.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..7ed6f3b 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,11 @@
#
#recovery_end_command = ''
#
+# specifies an optional timeout after nonzero code of restore_command.
+# This can be useful to increase/decrease number of a restore_command calls.
+#
+#restore_command_retry_interval = 5s
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e5dddd4..02c55a8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int restore_command_retry_interval = 5000L;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,28 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "restore_command_retry_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "restore_command_retry_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("restore_command_retry_interval = '%s'", item->value)));
+
+ if (restore_command_retry_interval <= 0)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must be bigger zero",
+ "restore_command_retry_interval")));
+ }
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10658,13 +10681,14 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
/*
- * Wait for more WAL to arrive. Time out after 5 seconds,
+ * Wait for more WAL to arrive. Time out after
+ * restore_command_retry_interval (5 seconds by default),
* like when polling the archive, to react to a trigger
* file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ restore_command_retry_interval);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
}
On Sat, Dec 27, 2014 at 3:42 AM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
Thanks for suggestions.
Patch updated.
Cool, thanks. I just had an extra look at it.
+ This is useful, if I using for restore of wal logs some
+ external storage (like AWS S3) and no matter what the slave database
+ will lag behind the master. The problem, what for each request to
+ AWS S3 need to pay, what is why for N nodes, which try to get next
+ wal log each 5 seconds will be bigger price, than for example each
+ 30 seconds.
I reworked this portion of the docs, it is rather incorrect as the
documentation should not use first-person subjects, and I don't
believe that referencing any commercial products is a good thing in
this context.
+# specifies an optional timeout after nonzero code of restore_command.
+# This can be useful to increase/decrease number of a restore_command calls.
This is still referring to a timeout. That's not good. And the name of
the parameter at the top of this comment block is missing.
+static int restore_command_retry_interval = 5000L;
I think that it would be more adapted to set that to 5000, and
multiply by 1L. I am also wondering about having a better lower bound,
like 100ms to avoid some abuse with this feature in the retries?
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("\"%s\" must be bigger zero",
+								"restore_command_retry_interval")));
I'd rather rewrite that to "must have a strictly positive value".
-			 * Wait for more WAL to arrive. Time out after 5 seconds,
+			 * Wait for more WAL to arrive. Time out after
+			 * restore_command_retry_interval (5 seconds by default),
			 * like when polling the archive, to react to a trigger
			 * file promptly.
			 */
			WaitLatch(&XLogCtl->recoveryWakeupLatch,
					  WL_LATCH_SET | WL_TIMEOUT,
-					  5000L);
+					  restore_command_retry_interval);
I should have noticed earlier, but in its current state your patch
actually does not work. What you are doing here is tuning the time the
process waits for WAL from streaming. In your case what you want to
control is the retry time for a restore_command in archive recovery,
no?
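For reference, the code being discussed, the retry throttle in WaitForWALToBecomeAvailable() when looping back to the archive, looks like this in the unpatched tree (these context lines appear unchanged in the diffs in this thread):

				now = (pg_time_t) time(NULL);
				if ((now - last_fail_time) < 5)
				{
					pg_usleep(1000000L * (5 - (now - last_fail_time)));
					now = (pg_time_t) time(NULL);
				}
				last_fail_time = now;
				currentSource = XLOG_FROM_ARCHIVE;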
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello.
Thanks, I understand: I was looking at the wrong part of the code. I hope I have made the right changes now.
To avoid modifying the current pg_usleep calculation, I changed the restore_command_retry_interval value to seconds (not milliseconds). In this case, the minimum value is 1 second.
Mon, 29 Dec 2014 00:15:03 +0900 from Michael Paquier <michael.paquier@gmail.com>:
On Sat, Dec 27, 2014 at 3:42 AM, Alexey Vasiliev < leopard_ne@inbox.ru > wrote:
Thanks for suggestions.
Patch updated.
Cool, thanks. I just had an extra look at it.
--
Alexey Vasiliev
Attachments:
0001-add-recovery_timeout-to-controll-timeout-between-res.patch (application/x-patch)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index ef78bc0..38420a5 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,26 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval">
+ <term><varname>restore_command_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>restore_command_retry_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If <varname>restore_command</> returns nonzero exit status code, retry
+ command after the interval of time specified by this parameter.
+ Default value is <literal>5s</>.
+ </para>
+ <para>
+ This is useful, if I using for restore of wal logs some
+ external storage and no matter what the slave database
+ will lag behind the master.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..5b63f60 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,11 @@
#
#recovery_end_command = ''
#
+# specifies an optional retry interval of restore_command command, if previous return nonzero exit status code.
+# This can be useful to increase/decrease number of a restore_command calls.
+#
+#restore_command_retry_interval = 5s
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e5dddd4..83a6db0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int restore_command_retry_interval = 5;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,28 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "restore_command_retry_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_S,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "restore_command_retry_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("restore_command_retry_interval = '%s'", item->value)));
+
+ if (restore_command_retry_interval < 1)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a strictly positive value",
+ "restore_command_retry_interval")));
+ }
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10495,13 +10518,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep restore_command_retry_interval
+ * (by default 5 seconds) to avoid busy-waiting.
*/
now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
+ if ((now - last_fail_time) < restore_command_retry_interval)
{
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
+ pg_usleep(1000000L * (restore_command_retry_interval - (now - last_fail_time)));
now = (pg_time_t) time(NULL);
}
last_fail_time = now;
On Tue, Dec 30, 2014 at 9:10 PM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
To avoid modifying the current pg_usleep calculation, I changed the
restore_command_retry_interval value to seconds (not milliseconds). In this
case, the minimum value is 1 second.
Er, what's the problem with not changing 1000000L to 1000L? The unit of
your parameter is ms AFAIK.
Also, I am not really getting the meaning of this paragraph:
+ <para>
+ This is useful, if I using for restore of wal logs some
+ external storage and no matter what the slave database
+ will lag behind the master.
+ </para>
Could you be more explicit here? What do you mean exactly?
(btw, please let's keep the thread readable and not reply at the top
of each post).
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Dec 30, 2014 at 9:33 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Tue, Dec 30, 2014 at 9:10 PM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
To avoid modifying the current pg_usleep calculation, I changed the
restore_command_retry_interval value to seconds (not milliseconds). In this
case, the minimum value is 1 second.
Er, what's the problem with not changing 1000000L to 1000L? The unit of
your parameter is ms AFAIK.
Of course I meant in the previous version of the patch, not the current
one. Wouldn't it be useful to use it with, for example, retry intervals
on the order of 100ms~300ms for some cases?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tue, 30 Dec 2014 21:39:33 +0900 from Michael Paquier <michael.paquier@gmail.com>:
On Tue, Dec 30, 2014 at 9:33 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Tue, Dec 30, 2014 at 9:10 PM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
To avoid modifying the current pg_usleep calculation, I changed the
restore_command_retry_interval value to seconds (not milliseconds). In this
case, the minimum value is 1 second.
Er, what's the problem with not changing 1000000L to 1000L? The unit of
your parameter is ms AFAIK.
Of course I meant in the previous version of the patch, not the current
one. Wouldn't it be useful to use it with, for example, retry intervals
on the order of 100ms~300ms for some cases?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Thanks, patch changed.
As I understand it, now = (pg_time_t) time(NULL); returns the time in seconds, which is why I multiply the value by 1000 to compare with restore_command_retry_interval in milliseconds.
I am not sure about small retry intervals; in my cases I need intervals bigger than 5 seconds (20-40 seconds). Right now I require this value to be at least 100 milliseconds.
--
Alexey Vasiliev
Attachments:
0001-add-recovery_timeout-to-controll-timeout-between-res.patch (application/x-patch)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index ef78bc0..52cb47d 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,29 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval">
+ <term><varname>restore_command_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>restore_command_retry_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If <varname>restore_command</> returns nonzero exit status code, retry
+ command after the interval of time specified by this parameter.
+ Default value is <literal>5s</>.
+ </para>
+ <para>
+ This is useful, if I using for restore of wal logs some
+ external storage and request each 5 seconds for wal logs
+ can be useless and cost additional money.
+ Increasing this parameter allow to increase retry interval of time,
+ if not found new wal logs inside external storage and no matter
+ what the slave database will lag behind the master.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..5b63f60 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,11 @@
#
#recovery_end_command = ''
#
+# specifies an optional retry interval of restore_command command, if previous return nonzero exit status code.
+# This can be useful to increase/decrease number of a restore_command calls.
+#
+#restore_command_retry_interval = 5s
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e5dddd4..67a873c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int restore_command_retry_interval = 5000;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,28 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "restore_command_retry_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "restore_command_retry_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("restore_command_retry_interval = '%s'", item->value)));
+
+ if (restore_command_retry_interval < 100)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a bigger 100 milliseconds value",
+ "restore_command_retry_interval")));
+ }
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10495,13 +10518,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep restore_command_retry_interval
+ * (by default 5 seconds) to avoid busy-waiting.
*/
now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
+ if (((now - last_fail_time) * 1000) < restore_command_retry_interval)
{
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
+ pg_usleep(1000L * (restore_command_retry_interval - ((now - last_fail_time) * 1000)));
now = (pg_time_t) time(NULL);
}
last_fail_time = now;
On Wed, Dec 31, 2014 at 12:22 AM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
Tue, 30 Dec 2014 21:39:33 +0900 from Michael Paquier <michael.paquier@gmail.com>:
As I understand it, now = (pg_time_t) time(NULL); returns the time in seconds, which is why I multiply the value by 1000 to compare with restore_command_retry_interval in milliseconds.
This way of doing is incorrect, you would get a wait time of 1s even
for retries lower than 1s. In order to get the feature working
correctly and to get a comparison granularity sufficient, you need to
use TimestampTz for the tracking of the current and last failure time
and to use the APIs from TimestampTz for difference calculations.
I am not sure about small retry intervals; in my cases I need intervals bigger than 5 seconds (20-40 seconds). Right now I require this value to be at least 100 milliseconds.
Other people may disagree here, but I don't see any reason to put
restrictions on this interval of time.
Attached is a patch fixing the feature to use TimestampTz, updating as
well the docs and recovery.conf.sample which was incorrect, on top of
other things I noticed here and there.
Alexey, does this new patch look fine for you?
--
Michael
Attachments:
20150105_restore_retry_interval.patch (text/x-diff)
From 9bb71e39b218885466a53c005a87361d4ae889fa Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 5 Jan 2015 17:10:13 +0900
Subject: [PATCH] Add restore_command_retry_interval to control retries of
restore_command
restore_command_retry_interval can be specified in recovery.conf to control
the interval of time between retries of restore_command when failures occur.
---
doc/src/sgml/recovery-config.sgml | 21 +++++++++++++
src/backend/access/transam/recovery.conf.sample | 9 ++++++
src/backend/access/transam/xlog.c | 42 ++++++++++++++++++++-----
3 files changed, 64 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0c64ff2..0488ae3 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval">
+ <term><varname>restore_command_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>restore_command_retry_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If <varname>restore_command</> returns nonzero exit status code, retry
+ command after the interval of time specified by this parameter,
+ measured in milliseconds if no unit is specified. Default value is
+ <literal>5s</>.
+ </para>
+ <para>
+ This command is particularly useful for warm standby configurations
+ to leverage the amount of retries done to restore a WAL segment file
+ depending on the activity of a master node.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..8699a33 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,15 @@
#
#recovery_end_command = ''
#
+#
+# restore_command_retry_interval
+#
+# specifies an optional retry interval for restore_command command, if
+# previous command returned nonzero exit status code. This can be useful
+# to increase or decrease the number of calls of restore_command.
+#
+#restore_command_retry_interval = '5s'
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5cc7e47..1944e6f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int restore_command_retry_interval = 5000;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,26 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "restore_command_retry_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "restore_command_retry_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("restore_command_retry_interval = '%s'", item->value)));
+
+ if (restore_command_retry_interval <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a value strictly positive",
+ "restore_command_retry_interval")));
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10345,8 +10366,8 @@ static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
- static pg_time_t last_fail_time = 0;
- pg_time_t now;
+ TimestampTz now = GetCurrentTimestamp();
+ TimestampTz last_fail_time = now;
/*-------
* Standby mode is implemented by a state machine:
@@ -10495,14 +10516,19 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by restore_command_retry_interval to avoid busy-waiting.
*/
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
+ now = GetCurrentTimestamp();
+ if (!TimestampDifferenceExceeds(last_fail_time, now,
+ restore_command_retry_interval))
{
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
+ long secs;
+ int microsecs;
+
+ TimestampDifference(last_fail_time, now, &secs, &microsecs);
+ pg_usleep(restore_command_retry_interval * 1000L - (1000000L * secs + 1L * microsecs));
+ now = GetCurrentTimestamp();
}
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
--
2.2.1
Mon, 5 Jan 2015 17:16:43 +0900 from Michael Paquier <michael.paquier@gmail.com>:
On Wed, Dec 31, 2014 at 12:22 AM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
Tue, 30 Dec 2014 21:39:33 +0900 from Michael Paquier <michael.paquier@gmail.com>:
As I understand it, now = (pg_time_t) time(NULL); returns the time in seconds, which is why I multiply the value by 1000 to compare with restore_command_retry_interval in milliseconds.
This way of doing is incorrect, you would get a wait time of 1s even
for retries lower than 1s. In order to get the feature working
correctly and to get a comparison granularity sufficient, you need to
use TimestampTz for the tracking of the current and last failure time
and to use the APIs from TimestampTz for difference calculations.
I am not sure about small retry intervals; in my cases I need intervals bigger than 5 seconds (20-40 seconds). Right now I require this value to be at least 100 milliseconds.
Other people may disagree here, but I don't see any reason to put
restrictions on this interval of time.
Attached is a patch fixing the feature to use TimestampTz, updating as
well the docs and recovery.conf.sample which was incorrect, on top of
other things I noticed here and there.
Alexey, does this new patch look fine for you?
--
Michael
Hello. Thanks for the help. Yes, the new patch looks fine!
--
Alexey Vasiliev
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jan 5, 2015 at 10:39 PM, Alexey Vasiliev <leopard_ne@inbox.ru> wrote:
Hello. Thanks for help. Yes, new patch look fine!
OK cool. Let's get an opinion from a committer then.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
On 2015-01-05 17:16:43 +0900, Michael Paquier wrote:
+   <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval">
+    <term><varname>restore_command_retry_interval</varname> (<type>integer</type>)
+      <indexterm>
+        <primary><varname>restore_command_retry_interval</> recovery parameter</primary>
+      </indexterm>
+    </term>
+    <listitem>
+     <para>
+      If <varname>restore_command</> returns nonzero exit status code, retry
+      command after the interval of time specified by this parameter,
+      measured in milliseconds if no unit is specified. Default value is
+      <literal>5s</>.
+     </para>
+     <para>
+      This command is particularly useful for warm standby configurations
+      to leverage the amount of retries done to restore a WAL segment file
+      depending on the activity of a master node.
+     </para>
Don't really like this paragraph. What's "leveraging the amount of
retries done" supposed to mean?
+# restore_command_retry_interval
+#
+# specifies an optional retry interval for restore_command command, if
+# previous command returned nonzero exit status code. This can be useful
+# to increase or decrease the number of calls of restore_command.
It's not really the number but frequency, right?
+		else if (strcmp(item->name, "restore_command_retry_interval") == 0)
+		{
+			const char *hintmsg;
+
+			if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_MS,
+						   &hintmsg))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("parameter \"%s\" requires a temporal value",
+								"restore_command_retry_interval"),
+						 hintmsg ? errhint("%s", _(hintmsg)) : 0));
"temporal value" sounds odd to my ears. I'd rather keep in line with
what guc.c outputs in such a case.
+			now = GetCurrentTimestamp();
+			if (!TimestampDifferenceExceeds(last_fail_time, now,
+											restore_command_retry_interval))
			{
-				pg_usleep(1000000L * (5 - (now - last_fail_time)));
-				now = (pg_time_t) time(NULL);
+				long		secs;
+				int			microsecs;
+
+				TimestampDifference(last_fail_time, now, &secs, &microsecs);
+				pg_usleep(restore_command_retry_interval * 1000L - (1000000L * secs + 1L * microsecs));
+				now = GetCurrentTimestamp();
			}
			last_fail_time = now;
			currentSource = XLOG_FROM_ARCHIVE;
Am I missing something or does this handle/affect streaming failures
just the same? That might not be a bad idea, because the current
interval is far too long for e.g. a sync replication setup. But it
certainly makes the name a complete misnomer.
Not this patch's fault, but I'm getting a bit tired of seeing the above open
coded. How about adding a function that does the sleeping based on a
timestamptz and a ms interval?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Jan 17, 2015 at 2:44 AM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2015-01-05 17:16:43 +0900, Michael Paquier wrote:
Don't really like this paragraph. What's "leveraging the amount of
retries done" supposed to mean?
Actually I have reworked the whole paragraph with the renaming of the
parameter. Hopefully that's more clear.
It's not really the number but frequency, right?
Sure.
"temporal value" sounds odd to my ears. I'd rather keep in line with
what guc.c outputs in such a case.
Yeah, I have been pondering about that a bit and I think that
wal_availability_check_interval is the closest thing as we want to
check if WAL is available for a walreceiver at record level and at
segment level for a restore_command.
Am I missing something or does this handle/affect streaming failures
just the same? That might not be a bad idea, because the current
interval is far too long for e.g. a sync replication setup. But it
certainly makes the name a complete misnomer.
Hm.
Not this patch's fault, but I'm getting a bit tired seeing the above open
coded. How about adding a function that does the sleeping based on a
timestamptz and a ms interval?
You mean in plugins, right? I don't recall seeing similar patterns in
other code paths of the backend. But I think that we can use something
like that in timestamp.c then because we need to leverage that between
two timestamps, the last failure and now():
TimestampSleepDifference(start_time, stop_time, internal_ms);
Perhaps you have something else in mind?
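A minimal sketch of what such a helper could look like in timestamp.c, assuming the TimestampDifference() and pg_usleep() APIs already used in this thread; the name and exact signature follow the proposal above and are not final:

void
TimestampSleepDifference(TimestampTz start_time, TimestampTz stop_time,
						 int interval_ms)
{
	long		secs;
	int			microsecs;
	long		sleeptime;

	/* time already elapsed between the two timestamps */
	TimestampDifference(start_time, stop_time, &secs, &microsecs);

	/* sleep for the remainder of the interval, if any is left */
	sleeptime = interval_ms * 1000L - (secs * 1000000L + microsecs);
	if (sleeptime > 0)
		pg_usleep(sleeptime);
}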
Attached is an updated patch.
--
Michael
Attachments:
20150119_wal_avail_check.patch (text/x-diff)
From 0206c417f5b28eadb5f9d67ee0643320d7c0b621 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 19 Jan 2015 16:08:48 +0900
Subject: [PATCH] Add wal_availability_check_interval to control WAL fetching
on failure
Default value is 5s.
---
doc/src/sgml/recovery-config.sgml | 21 +++++++++++++
src/backend/access/transam/recovery.conf.sample | 10 +++++++
src/backend/access/transam/xlog.c | 39 ++++++++++++++++++++-----
src/backend/utils/adt/timestamp.c | 19 ++++++++++++
src/include/utils/timestamp.h | 2 ++
5 files changed, 83 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0c64ff2..7fd51ce 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="wal-availability-check-interval" xreflabel="wal_availability_check_interval">
+ <term><varname>wal_availability_check_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_availability_check_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ This parameter specifies the amount of time to wait when
+ WAL is not available for a node in recovery. Default value is
+ <literal>5s</>.
+ </para>
+ <para>
+ A node in recovery will wait for this amount of time if
+ <varname>restore_command</> returns nonzero exit status code when
+ fetching new WAL segment files from archive or when a WAL receiver
+ is not able to fetch a WAL record when using streaming replication.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..70d3946 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,16 @@
#
#recovery_end_command = ''
#
+#
+# wal_availability_check_interval
+#
+# specifies an optional interval to check for availability of WAL when
+# recovering a node. This interval of time represents the frequency of
+# retries if a previous command of restore_command returned nonzero exit
+# status code or if a walreceiver did not stream completely a WAL record.
+#
+#wal_availability_check_interval = '5s'
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..0da9f5c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int wal_availability_check_interval = 5000;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,26 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "wal_availability_check_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &wal_availability_check_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "wal_availability_check_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("wal_availability_check_interval = '%s'", item->value)));
+
+ if (wal_availability_check_interval <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a value strictly positive",
+ "wal_availability_check_interval")));
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10340,8 +10361,8 @@ static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
- static pg_time_t last_fail_time = 0;
- pg_time_t now;
+ TimestampTz now = GetCurrentTimestamp();
+ TimestampTz last_fail_time = now;
/*-------
* Standby mode is implemented by a state machine:
@@ -10490,14 +10511,16 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by wal_availability_check_interval to avoid busy-waiting.
*/
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
+ now = GetCurrentTimestamp();
+ if (!TimestampDifferenceExceeds(last_fail_time, now,
+ wal_availability_check_interval))
{
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
+ TimestampSleepDifference(last_fail_time, now,
+ wal_availability_check_interval);
+ now = GetCurrentTimestamp();
}
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 67e0cf9..1098fc9 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1674,6 +1674,25 @@ TimestampDifferenceExceeds(TimestampTz start_time,
}
/*
+ * TimestampSleepDifference -- sleep for the amout of interval time
+ * specified, reduced by the difference between two timestamps.
+ *
+ * Both inputs must be ordinary finite timestamps (in current usage,
+ * they'll be results from GetCurrentTimestamp()).
+ */
+void
+TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time,
+ int interval_msec)
+{
+ long secs;
+ int microsecs;
+
+ TimestampDifference(start_time, stop_time, &secs, &microsecs);
+ pg_usleep(interval_msec * 1000L - (1000000L * secs + 1L * microsecs));
+}
+
+/*
* Convert a time_t to TimestampTz.
*
* We do not use time_t internally in Postgres, but this is provided for use
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 70118f5..e271d0b 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -217,6 +217,8 @@ extern void TimestampDifference(TimestampTz start_time, TimestampTz stop_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern void TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time, int interval_msec);
/*
* Prototypes for functions to deal with integer timestamps, when the native
--
2.2.2
On Mon, Jan 19, 2015 at 4:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Sat, Jan 17, 2015 at 2:44 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Am I missing something or does this handle/affect streaming failures
just the same? That might not be a bad idea, because the current
interval is far too long for e.g. a sync replication setup. But it
certainly makes the name a complete misnomer.
Hm.
Sorry for the typo here, I meant that I agree on that. But I am sure
you guessed it..
--
Michael
On Mon, Jan 19, 2015 at 4:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Sat, Jan 17, 2015 at 2:44 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Not this patch's fault, but I'm getting a bit tired seeing the above open
coded. How about adding a function that does the sleeping based on a
timestamptz and a ms interval?
You mean in plugins, right? I don't recall seeing similar patterns in
other code paths of backend. But I think that we can use something
like that in timestamp.c then because we need to leverage that between
two timestamps, the last failure and now():
TimestampSleepDifference(start_time, stop_time, internal_ms);
Perhaps you have something else in mind?
Attached is an updated patch.
Actually I came up with something better than the last patch by using
a boolean flag as the return value of TimestampSleepDifference and
using TimestampDifferenceExceeds directly inside it.
Regards,
--
Michael
Attachments:
20150119_wal_avail_check_v2.patch
From d77429197838c0ad9d378aba08137d4ca0b384fd Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 19 Jan 2015 16:08:48 +0900
Subject: [PATCH] Add wal_availability_check_interval to control WAL fetching
on failure
Default value is 5s.
---
doc/src/sgml/recovery-config.sgml | 21 +++++++++++++
src/backend/access/transam/recovery.conf.sample | 10 +++++++
src/backend/access/transam/xlog.c | 39 ++++++++++++++++++-------
src/backend/utils/adt/timestamp.c | 24 +++++++++++++++
src/include/utils/timestamp.h | 2 ++
5 files changed, 86 insertions(+), 10 deletions(-)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0c64ff2..7fd51ce 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="wal-availability-check-interval" xreflabel="wal_availability_check_interval">
+ <term><varname>wal_availability_check_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_availability_check_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ This parameter specifies the amount of time to wait when
+ WAL is not available for a node in recovery. Default value is
+ <literal>5s</>.
+ </para>
+ <para>
+ A node in recovery will wait for this amount of time if
+ <varname>restore_command</> returns nonzero exit status code when
+ fetching new WAL segment files from archive or when a WAL receiver
+ is not able to fetch a WAL record when using streaming replication.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..70d3946 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,16 @@
#
#recovery_end_command = ''
#
+#
+# wal_availability_check_interval
+#
+# specifies an optional interval to check for availability of WAL when
+# recovering a node. This interval of time represents the frequency of
+# retries if a previous command of restore_command returned nonzero exit
+# status code or if a walreceiver did not stream completely a WAL record.
+#
+#wal_availability_check_interval = '5s'
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..39b4e1c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int wal_availability_check_interval = 5000;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,26 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "wal_availability_check_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &wal_availability_check_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "wal_availability_check_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("wal_availability_check_interval = '%s'", item->value)));
+
+ if (wal_availability_check_interval <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a value strictly positive",
+ "wal_availability_check_interval")));
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10340,8 +10361,8 @@ static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
- static pg_time_t last_fail_time = 0;
- pg_time_t now;
+ TimestampTz now = GetCurrentTimestamp();
+ TimestampTz last_fail_time = now;
/*-------
* Standby mode is implemented by a state machine:
@@ -10490,15 +10511,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by wal_availability_check_interval to avoid busy-waiting.
*/
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
- {
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
- }
+ now = GetCurrentTimestamp();
+ if (TimestampSleepDifference(last_fail_time, now,
+ wal_availability_check_interval))
+ now = GetCurrentTimestamp();
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
break;
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 67e0cf9..2b11c42 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1674,6 +1674,30 @@ TimestampDifferenceExceeds(TimestampTz start_time,
}
/*
+ * TimestampSleepDifference -- sleep for the amout of interval time
+ * specified, reduced by the difference between two timestamps.
+ * Returns true if sleep is done, false otherwise.
+ *
+ * Both inputs must be ordinary finite timestamps (in current usage,
+ * they'll be results from GetCurrentTimestamp()).
+ */
+bool
+TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time,
+ int interval_msec)
+{
+ long secs;
+ int microsecs;
+
+ if (TimestampDifferenceExceeds(start_time, stop_time, interval_msec))
+ return false;
+
+ TimestampDifference(start_time, stop_time, &secs, &microsecs);
+ pg_usleep(interval_msec * 1000L - (1000000L * secs + 1L * microsecs));
+ return true;
+}
+
+/*
* Convert a time_t to TimestampTz.
*
* We do not use time_t internally in Postgres, but this is provided for use
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 70118f5..e4a7673 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -217,6 +217,8 @@ extern void TimestampDifference(TimestampTz start_time, TimestampTz stop_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time, int interval_msec);
/*
* Prototypes for functions to deal with integer timestamps, when the native
--
2.2.2
On 2015-01-19 17:16:11 +0900, Michael Paquier wrote:
On Mon, Jan 19, 2015 at 4:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Sat, Jan 17, 2015 at 2:44 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Not this patch's fault, but I'm getting a bit tired seeing the above open
coded. How about adding a function that does the sleeping based on a
timestamptz and a ms interval?
You mean in plugins, right? I don't recall seeing similar patterns in
other code paths of backend. But I think that we can use something
like that in timestamp.c then because we need to leverage that between
two timestamps, the last failure and now():
TimestampSleepDifference(start_time, stop_time, internal_ms);
Perhaps you have something else in mind?
Attached is an updated patch.
Actually I came up with something better than the last patch by using
a boolean flag as the return value of TimestampSleepDifference and
using TimestampDifferenceExceeds directly inside it.
Subject: [PATCH] Add wal_availability_check_interval to control WAL fetching
on failure
I think that name isn't very good. And it isn't very accurate
either.
How about wal_retrieve_retry_interval? Not very nice, but I think it's
still better than the above.
+ <varlistentry id="wal-availability-check-interval" xreflabel="wal_availability_check_interval">
+ <term><varname>wal_availability_check_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_availability_check_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ This parameter specifies the amount of time to wait when
+ WAL is not available for a node in recovery. Default value is
+ <literal>5s</>.
+ </para>
+ <para>
+ A node in recovery will wait for this amount of time if
+ <varname>restore_command</> returns nonzero exit status code when
+ fetching new WAL segment files from archive or when a WAL receiver
+ is not able to fetch a WAL record when using streaming replication.
+ </para>
+ </listitem>
+ </varlistentry>
+
 </variablelist>
Walreceiver doesn't wait that amount, but rather how long the connection
is intact. And restore_command may or may not retry.
/*-------
 * Standby mode is implemented by a state machine:
@@ -10490,15 +10511,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 * machine, so we've exhausted all the options for
 * obtaining the requested WAL. We're going to loop back
 * and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by wal_availability_check_interval to avoid busy-waiting.
 */
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
- {
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
- }
+ now = GetCurrentTimestamp();
+ if (TimestampSleepDifference(last_fail_time, now,
+ wal_availability_check_interval))
+ now = GetCurrentTimestamp();
Not bad, that's much easier to read imo.
Greetings,
Andres Freund
On Tue, Jan 20, 2015 at 1:56 AM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2015-01-19 17:16:11 +0900, Michael Paquier wrote:
On Mon, Jan 19, 2015 at 4:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Sat, Jan 17, 2015 at 2:44 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Not this patch's fault, but I'm getting a bit tired seeing the above open
coded. How about adding a function that does the sleeping based on a
timestamptz and a ms interval?
You mean in plugins, right? I don't recall seeing similar patterns in
other code paths of backend. But I think that we can use something
like that in timestamp.c then because we need to leverage that between
two timestamps, the last failure and now():
TimestampSleepDifference(start_time, stop_time, internal_ms);
Perhaps you have something else in mind?
Attached is an updated patch.
Actually I came up with something better than the last patch by using
a boolean flag as the return value of TimestampSleepDifference and
using TimestampDifferenceExceeds directly inside it.
Subject: [PATCH] Add wal_availability_check_interval to control WAL fetching
on failure
I think that name isn't very good. And it isn't very accurate
either.
How about wal_retrieve_retry_interval? Not very nice, but I think it's
still better than the above.
Fine for me.
Walreceiver doesn't wait that amount, but rather how long the connection
is intact. And restore_command may or may not retry.
I changed this paragraph as follows:
+ This parameter specifies the amount of time to wait when a failure
+ occurred when reading WAL from a source (be it via streaming
+ replication, local <filename>pg_xlog</> or WAL archive) for a node
+ in standby mode, or when WAL is expected from a source. Default
+ value is <literal>5s</>.
Pondering more about this patch, I am getting the feeling that it is
not that much of a win to explain in detail in the docs each failure
case depending on the source from which WAL is taken. But it is essential
to mention that the wait is done only for a node using standby mode.
Btw, I noticed two extra things that I think should be done for consistency:
- We should also use wal_retrieve_retry_interval when waiting for
WAL to arrive after checking for failures:
/*
- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
+ * Wait for more WAL to arrive. Time out after
+ * wal_retrieve_retry_interval ms, like when polling the archive
+ * to react to a trigger file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ wal_retrieve_retry_interval * 1L);
-
Updated patch attached.
--
Michael
Attachments:
20150119_wal_avail_check_v3.patch
From e07949030a676b033f4a563cfaac4687bcc37dca Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 19 Jan 2015 16:08:48 +0900
Subject: [PATCH] Add wal_retrieve_retry_interval to control WAL retrieval at
recovery
This parameter allows to control the amount of time process will wait
for WAL from a source when a failure occurred for a standby node, or
for the amount of time to wait when WAL is expected from a source.
Default value is 5s.
---
doc/src/sgml/recovery-config.sgml | 17 +++++++++
src/backend/access/transam/recovery.conf.sample | 9 +++++
src/backend/access/transam/xlog.c | 50 +++++++++++++++++--------
src/backend/utils/adt/timestamp.c | 24 ++++++++++++
src/include/utils/timestamp.h | 2 +
5 files changed, 87 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0c64ff2..becb2d3 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,23 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="wal-retrieve-retry-interval" xreflabel="wal_retrieve_retry_interval">
+ <term><varname>wal_retrieve_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_retrieve_retry_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ This parameter specifies the amount of time to wait for WAL to become
+ available when a failure occurred when reading WAL from a source (be
+ it via streaming replication, local <filename>pg_xlog</> or WAL archive)
+ for a node in standby mode, or when WAL is expected from a source.
+ Default value is <literal>5s</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..458308c 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,15 @@
#
#recovery_end_command = ''
#
+#
+# wal_retrieve_retry_interval
+#
+# specifies an optional internal to wait for WAL to become available when
+# a failure occurred when reading WAL from a source for a node in standby
+# mode, or when WAL is expected from a source.
+#
+#wal_retrieve_retry_interval = '5s'
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..76bd5e2 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int wal_retrieve_retry_interval = 5000;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,26 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "wal_retrieve_retry_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &wal_retrieve_retry_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "wal_retrieve_retry_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("wal_retrieve_retry_interval = '%s'", item->value)));
+
+ if (wal_retrieve_retry_interval <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a value strictly positive",
+ "wal_retrieve_retry_interval")));
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10340,8 +10361,8 @@ static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
- static pg_time_t last_fail_time = 0;
- pg_time_t now;
+ TimestampTz now = GetCurrentTimestamp();
+ TimestampTz last_fail_time = now;
/*-------
* Standby mode is implemented by a state machine:
@@ -10351,7 +10372,8 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* 2. Check trigger file
* 3. Read from primary server via walreceiver (XLOG_FROM_STREAM)
* 4. Rescan timelines
- * 5. Sleep 5 seconds, and loop back to 1.
+ * 5. Sleep for the internal of time specified by
+ * wal_retrieve_retry_interval, and loop back to 1.
*
* Failure to read from the current source advances the state machine to
* the next state.
@@ -10490,15 +10512,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by wal_retrieve_retry_interval to avoid busy-waiting.
*/
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
- {
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
- }
+ now = GetCurrentTimestamp();
+ if (TimestampSleepDifference(last_fail_time, now,
+ wal_retrieve_retry_interval))
+ now = GetCurrentTimestamp();
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
break;
@@ -10653,13 +10673,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
/*
- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
+ * Wait for more WAL to arrive. Time out after
+ * wal_retrieve_retry_interval ms, like when polling the archive
+ * to react to a trigger file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ wal_retrieve_retry_interval * 1L);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 67e0cf9..2b11c42 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1674,6 +1674,30 @@ TimestampDifferenceExceeds(TimestampTz start_time,
}
/*
+ * TimestampSleepDifference -- sleep for the amout of interval time
+ * specified, reduced by the difference between two timestamps.
+ * Returns true if sleep is done, false otherwise.
+ *
+ * Both inputs must be ordinary finite timestamps (in current usage,
+ * they'll be results from GetCurrentTimestamp()).
+ */
+bool
+TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time,
+ int interval_msec)
+{
+ long secs;
+ int microsecs;
+
+ if (TimestampDifferenceExceeds(start_time, stop_time, interval_msec))
+ return false;
+
+ TimestampDifference(start_time, stop_time, &secs, &microsecs);
+ pg_usleep(interval_msec * 1000L - (1000000L * secs + 1L * microsecs));
+ return true;
+}
+
+/*
* Convert a time_t to TimestampTz.
*
* We do not use time_t internally in Postgres, but this is provided for use
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 70118f5..e4a7673 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -217,6 +217,8 @@ extern void TimestampDifference(TimestampTz start_time, TimestampTz stop_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time, int interval_msec);
/*
* Prototypes for functions to deal with integer timestamps, when the native
--
2.2.2
On Tue, Jan 20, 2015 at 12:57 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
+ TimestampDifference(start_time, stop_time, &secs, &microsecs);
+ pg_usleep(interval_msec * 1000L - (1000000L * secs + 1L * microsecs));
What if a large interval is set and a signal arrives during the sleep?
I'm afraid that a signal cannot terminate the sleep on some platforms.
This problem exists even now because pg_usleep is used, but the sleep
time is just 5 seconds, so it's not so bad. But the patch allows a user to
set a large sleep time. Shouldn't we use WaitLatch, or split the pg_usleep
like recoveryPausesHere() does?
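To illustrate the suggestion (this is only a sketch, not code from any posted
patch; the helper name and the 100ms slice are made up for the example), the
chunked sleep could look roughly like this:

/*
 * Sketch only: sleep for interval_msec in short slices so that a pending
 * interrupt ends the wait quickly, in the spirit of recoveryPausesHere().
 */
static void
SleepInterruptibly(int interval_msec)
{
	long	remaining = interval_msec * 1000L;	/* microseconds left to sleep */

	while (remaining > 0)
	{
		long	slice = Min(remaining, 100000L);	/* at most 100ms per slice */

		pg_usleep(slice);
		HandleStartupProcInterrupts();	/* only valid in the startup process */
		remaining -= slice;
	}
}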
Regards,
--
Fujii Masao
Fujii Masao wrote:
+ TimestampDifference(start_time, stop_time, &secs, &microsecs);
+ pg_usleep(interval_msec * 1000L - (1000000L * secs + 1L * microsecs));
What if a large interval is set and a signal arrives during the sleep?
I'm afraid that a signal cannot terminate the sleep on some platforms.
This problem exists even now because pg_usleep is used, but the sleep
time is just 5 seconds, so it's not so bad. But the patch allows a user to
set a large sleep time.
Yes, and I thought that we could live with that for this patch... Now
that you mention it, something similar to what recoveryPausesHere() does
would be quite good to ease the shutdown of an interrupted process, even
more so than now. So let's do that.
Shouldn't we use WaitLatch or split the pg_usleep like recoveryPausesHere() does?
I'd vote for the latter, as we would still need to calculate a
timestamp difference in any case, so it feels easier to do that in
the single new API this patch adds (a routine useful for plugins as
well). Now, I will not fight it if people think that using
recoveryWakeupLatch is better.
An updated patch is attached. It also contains a fix for something
that was mentioned upthread but not included in the latest version:
wal_availability_check_interval should be used both when waiting for a WAL
record from a stream and when fetching a segment from the archives.
The last version did only the former, not the latter.
--
Michael
Attachments:
20150205_wal_avail_check_v3.patch
From 9c3a7fa35538993c1345a40fe0973332a76bdb81 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 19 Jan 2015 16:08:48 +0900
Subject: [PATCH] Add wal_availability_check_interval
This parameter aids to control at which timing WAL availability is checked
when a node is in recovery, particularly when successive failures happen when
fetching WAL archives, or when fetching WAL records from a streaming source.
Default value is 5s.
---
doc/src/sgml/recovery-config.sgml | 21 +++++++++++
src/backend/access/transam/recovery.conf.sample | 10 ++++++
src/backend/access/transam/xlog.c | 47 +++++++++++++++++--------
src/backend/utils/adt/timestamp.c | 38 ++++++++++++++++++++
src/include/utils/timestamp.h | 2 ++
5 files changed, 104 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0c64ff2..7fd51ce 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="wal-availability-check-interval" xreflabel="wal_availability_check_interval">
+ <term><varname>wal_availability_check_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_availability_check_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ This parameter specifies the amount of time to wait when
+ WAL is not available for a node in recovery. Default value is
+ <literal>5s</>.
+ </para>
+ <para>
+ A node in recovery will wait for this amount of time if
+ <varname>restore_command</> returns nonzero exit status code when
+ fetching new WAL segment files from archive or when a WAL receiver
+ is not able to fetch a WAL record when using streaming replication.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..70d3946 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,16 @@
#
#recovery_end_command = ''
#
+#
+# wal_availability_check_interval
+#
+# specifies an optional interval to check for availability of WAL when
+# recovering a node. This interval of time represents the frequency of
+# retries if a previous command of restore_command returned nonzero exit
+# status code or if a walreceiver did not stream completely a WAL record.
+#
+#wal_availability_check_interval = '5s'
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..4f4efca 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int wal_availability_check_interval = 5000;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,26 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "wal_availability_check_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &wal_availability_check_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "wal_availability_check_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("wal_availability_check_interval = '%s'", item->value)));
+
+ if (wal_availability_check_interval <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a value strictly positive",
+ "wal_availability_check_interval")));
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10340,8 +10361,8 @@ static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
- static pg_time_t last_fail_time = 0;
- pg_time_t now;
+ TimestampTz now = GetCurrentTimestamp();
+ TimestampTz last_fail_time = now;
/*-------
* Standby mode is implemented by a state machine:
@@ -10490,15 +10511,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by wal_availability_check_interval to avoid busy-waiting.
*/
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
- {
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
- }
+ now = GetCurrentTimestamp();
+ if (TimestampSleepDifference(last_fail_time, now,
+ wal_availability_check_interval))
+ now = GetCurrentTimestamp();
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
break;
@@ -10653,13 +10672,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
/*
- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
+ * Wait for more WAL to arrive. Time out after the amount of
+ * time specified by wal_availability_check_interval, like
+ * when polling the archive, to react to a trigger file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ wal_availability_check_interval * 1000L);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 67e0cf9..a4cf717 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1674,6 +1674,44 @@ TimestampDifferenceExceeds(TimestampTz start_time,
}
/*
+ * TimestampSleepDifference -- sleep for the amout of interval time
+ * specified, reduced by the difference between two timestamps.
+ * Returns true if sleep is done, false otherwise.
+ *
+ * Both inputs must be ordinary finite timestamps (in current usage,
+ * they'll be results from GetCurrentTimestamp()). Sleep is done
+ * per steps of 1s to be able to handle process interruptions without
+ * having to wait for a too long time.
+ */
+bool
+TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time,
+ int interval_msec)
+{
+ long secs, total_time;
+ int microsecs;
+
+ if (TimestampDifferenceExceeds(start_time, stop_time, interval_msec))
+ return false;
+
+ TimestampDifference(start_time, stop_time, &secs, &microsecs);
+ total_time = interval_msec * 1000L - (1000000L * secs + 1L * microsecs);
+
+ while (total_time > 0)
+ {
+ long wait_time = 1000000L; /* 1s */
+
+ if (total_time < wait_time)
+ wait_time = total_time;
+
+ pg_usleep(wait_time);
+ HandleStartupProcInterrupts();
+ total_time -= wait_time;
+ }
+ return true;
+}
+
+/*
* Convert a time_t to TimestampTz.
*
* We do not use time_t internally in Postgres, but this is provided for use
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 70118f5..e4a7673 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -217,6 +217,8 @@ extern void TimestampDifference(TimestampTz start_time, TimestampTz stop_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time, int interval_msec);
/*
* Prototypes for functions to deal with integer timestamps, when the native
--
2.2.2
On Thu, Feb 5, 2015 at 11:58 PM, Michael Paquier wrote:
An updated patch is attached.
I just noticed that the patch I sent was incorrect:
- Parameter name was still wal_availability_check_interval and not
wal_retrieve_retry_interval
- Documentation was incorrect.
Please use the patch attached instead for further review.
--
Michael
Attachments:
20150206_wal_source_check_v5.patch
From 06a4d3d1f5fe4362d7be1404cbf0b45b74fea69f Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 19 Jan 2015 16:08:48 +0900
Subject: [PATCH] Add wal_retrieve_retry_interval
This parameter aids to control at which timing WAL availability is checked
when a node is in recovery, particularly when successive failures happen when
fetching WAL archives, or when fetching WAL records from a streaming source.
Default value is 5s.
---
doc/src/sgml/recovery-config.sgml | 17 +++++++++
src/backend/access/transam/recovery.conf.sample | 9 +++++
src/backend/access/transam/xlog.c | 47 +++++++++++++++++--------
src/backend/utils/adt/timestamp.c | 38 ++++++++++++++++++++
src/include/utils/timestamp.h | 2 ++
5 files changed, 99 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 0c64ff2..d4babbd 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,23 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="wal-retrieve-retry-interval" xreflabel="wal_retrieve_retry_interval">
+ <term><varname>wal_retrieve_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_retrieve_retry_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ This parameter specifies the amount of time to wait when a failure
+ occurred when reading WAL from a source (be it via streaming
+ replication, local <filename>pg_xlog</> or WAL archive) for a node
+ in standby mode, or when WAL is expected from a source. Default
+ value is <literal>5s</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..458308c 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,15 @@
#
#recovery_end_command = ''
#
+#
+# wal_retrieve_retry_interval
+#
+# specifies an optional internal to wait for WAL to become available when
+# a failure occurred when reading WAL from a source for a node in standby
+# mode, or when WAL is expected from a source.
+#
+#wal_retrieve_retry_interval = '5s'
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..111e53d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int wal_retrieve_retry_interval = 5000;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,26 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "wal_retrieve_retry_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &wal_retrieve_retry_interval, GUC_UNIT_MS,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "wal_retrieve_retry_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("wal_retrieve_retry_interval = '%s'", item->value)));
+
+ if (wal_retrieve_retry_interval <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a value strictly positive",
+ "wal_retrieve_retry_interval")));
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10340,8 +10361,8 @@ static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
- static pg_time_t last_fail_time = 0;
- pg_time_t now;
+ TimestampTz now = GetCurrentTimestamp();
+ TimestampTz last_fail_time = now;
/*-------
* Standby mode is implemented by a state machine:
@@ -10490,15 +10511,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by wal_retrieve_retry_interval to avoid busy-waiting.
*/
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
- {
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
- }
+ now = GetCurrentTimestamp();
+ if (TimestampSleepDifference(last_fail_time, now,
+ wal_retrieve_retry_interval))
+ now = GetCurrentTimestamp();
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
break;
@@ -10653,13 +10672,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
/*
- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
+ * Wait for more WAL to arrive. Time out after the amount of
+ * time specified by wal_retrieve_retry_interval, like
+ * when polling the archive, to react to a trigger file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ wal_retrieve_retry_interval * 1000L);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 67e0cf9..a4cf717 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1674,6 +1674,44 @@ TimestampDifferenceExceeds(TimestampTz start_time,
}
/*
+ * TimestampSleepDifference -- sleep for the amout of interval time
+ * specified, reduced by the difference between two timestamps.
+ * Returns true if sleep is done, false otherwise.
+ *
+ * Both inputs must be ordinary finite timestamps (in current usage,
+ * they'll be results from GetCurrentTimestamp()). Sleep is done
+ * per steps of 1s to be able to handle process interruptions without
+ * having to wait for a too long time.
+ */
+bool
+TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time,
+ int interval_msec)
+{
+ long secs, total_time;
+ int microsecs;
+
+ if (TimestampDifferenceExceeds(start_time, stop_time, interval_msec))
+ return false;
+
+ TimestampDifference(start_time, stop_time, &secs, &microsecs);
+ total_time = interval_msec * 1000L - (1000000L * secs + 1L * microsecs);
+
+ while (total_time > 0)
+ {
+ long wait_time = 1000000L; /* 1s */
+
+ if (total_time < wait_time)
+ wait_time = total_time;
+
+ pg_usleep(wait_time);
+ HandleStartupProcInterrupts();
+ total_time -= wait_time;
+ }
+ return true;
+}
+
+/*
* Convert a time_t to TimestampTz.
*
* We do not use time_t internally in Postgres, but this is provided for use
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 70118f5..e4a7673 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -217,6 +217,8 @@ extern void TimestampDifference(TimestampTz start_time, TimestampTz stop_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampSleepDifference(TimestampTz start_time,
+ TimestampTz stop_time, int interval_msec);
/*
* Prototypes for functions to deal with integer timestamps, when the native
--
2.2.2
On Fri, Feb 6, 2015 at 1:23 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Thu, Feb 5, 2015 at 11:58 PM, Michael Paquier wrote:
An updated patch is attached.
I just noticed that the patch I sent was incorrect:
- Parameter name was still wal_availability_check_interval and not
wal_retrieve_retry_interval
- Documentation was incorrect.
Please use the patch attached instead for further review.
Thanks for updating the patch!
timestamp.c:1708: warning: implicit declaration of function
'HandleStartupProcInterrupts'
I got the above warning during compilation.
+ pg_usleep(wait_time);
+ HandleStartupProcInterrupts();
+ total_time -= wait_time;
It seems strange to call the startup-process-specific function in
the "generic" function like TimestampSleepDifference().
* 5. Sleep 5 seconds, and loop back to 1.
In WaitForWALToBecomeAvailable(), the above comment should
be updated.
- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
+ * Wait for more WAL to arrive. Time out after the amount of
+ * time specified by wal_retrieve_retry_interval, like
+ * when polling the archive, to react to a trigger file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ wal_retrieve_retry_interval * 1000L);
This change can prevent the startup process from reacting to
a trigger file. Imagine the case where a large interval is set
and the user wants to promote the standby by using the trigger file
instead of pg_ctl promote. I think that the sleep time should be 5s
if the interval is set to more than 5s. Thought?
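In other words (a sketch of this suggestion only, not code taken from the
patch), the wait could be capped along these lines:

	/* Never block longer than 5s in one wait, so a trigger file is still
	 * noticed promptly even with a large wal_retrieve_retry_interval. */
	long	timeout_ms = Min(wal_retrieve_retry_interval, 5000);

	WaitLatch(&XLogCtl->recoveryWakeupLatch,
			  WL_LATCH_SET | WL_TIMEOUT,
			  timeout_ms);
	ResetLatch(&XLogCtl->recoveryWakeupLatch);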
+# specifies an optional internal to wait for WAL to become available when
s/internal/interval?
+ This parameter specifies the amount of time to wait when a failure
+ occurred when reading WAL from a source (be it via streaming
+ replication, local <filename>pg_xlog</> or WAL archive) for a node
+ in standby mode, or when WAL is expected from a source.
At least to me, it seems not easy to understand what the parameter is
from this description. What about the following, for example?
This parameter specifies the amount of time to wait when WAL is not
available from any sources (streaming replication, local
<filename>pg_xlog</> or WAL archive) before retrying to retrieve WAL....
Isn't it better to place this parameter in postgresql.conf rather than
recovery.conf? I'd like to change the value of this parameter without
restarting the server.
Regards,
--
Fujii Masao
On Fri, Feb 6, 2015 at 4:58 PM, Fujii Masao wrote:
timestamp.c:1708: warning: implicit declaration of function
'HandleStartupProcInterrupts'
I got the above warning during compilation.
+ pg_usleep(wait_time);
+ HandleStartupProcInterrupts();
+ total_time -= wait_time;
It seems strange to call the startup-process-specific function in
the "generic" function like TimestampSleepDifference().
I changed the sleep to a WaitLatch and moved all the logic back to
xlog.c, removing at the same time the stuff in timestamp.c, because it
seems strange to add a dependency on latches there.
* 5. Sleep 5 seconds, and loop back to 1.
In WaitForWALToBecomeAvailable(), the above comment should
be updated.
Done.
- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
+ * Wait for more WAL to arrive. Time out after the amount of
+ * time specified by wal_retrieve_retry_interval, like
+ * when polling the archive, to react to a trigger file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ wal_retrieve_retry_interval * 1000L);
This change can prevent the startup process from reacting to
a trigger file. Imagine the case where a large interval is set
and the user wants to promote the standby by using the trigger file
instead of pg_ctl promote. I think that the sleep time should be 5s
if the interval is set to more than 5s. Thought?
I disagree here. It is interesting to accelerate the check for WAL
availability from a source in some cases for replication, but the
opposite is true as well, as Alexey mentioned at the beginning of
the thread, to reduce the number of requests when fetching WAL
archives from external storage like AWS. Hence a correct solution
would be to check periodically for the trigger file, with a maximum
one-time wait of 5s to keep backward-compatible behavior. We could
reduce it to 1s or something like that as well.
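Roughly, the behavior described above would be something like the following
sketch (variable names are illustrative; the attached patch is the actual
implementation):

	/* Wait up to wal_retrieve_retry_interval, but in slices of at most 5s,
	 * checking for a promotion trigger between slices. */
	long	remaining_ms = wal_retrieve_retry_interval;

	while (remaining_ms > 0)
	{
		long	slice_ms = Min(remaining_ms, 5000L);

		WaitLatch(&XLogCtl->recoveryWakeupLatch,
				  WL_LATCH_SET | WL_TIMEOUT,
				  slice_ms);
		ResetLatch(&XLogCtl->recoveryWakeupLatch);

		if (CheckForStandbyTrigger())
			break;				/* promotion requested, stop waiting */

		remaining_ms -= slice_ms;
	}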
+# specifies an optional internal to wait for WAL to become available when
s/internal/interval?
This part has been removed in the new patch, as the parameter is now a GUC.
+ This parameter specifies the amount of time to wait when a failure
+ occurred when reading WAL from a source (be it via streaming
+ replication, local <filename>pg_xlog</> or WAL archive) for a node
+ in standby mode, or when WAL is expected from a source.
At least to me, it seems not easy to understand what the parameter is
from this description. What about the following, for example?
This parameter specifies the amount of time to wait when WAL is not
available from any sources (streaming replication, local
<filename>pg_xlog</> or WAL archive) before retrying to retrieve WAL....
OK, done.
Isn't it better to place this parameter in postgresql.conf rather than
recovery.conf? I'd like to change the value of this parameter without
restarting the server.
Yes, agreed. Done so with a SIGHUP guc.
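For reference, a SIGHUP GUC of this kind is declared in guc.c roughly as
below; this is only a sketch, and the group, description wording and bounds
shown here are assumptions rather than a quote from the patch:

	{
		{"wal_retrieve_retry_interval", PGC_SIGHUP, REPLICATION_STANDBY,
			gettext_noop("Sets the time to wait before retrying to retrieve WAL "
						 "after a failed attempt."),
			NULL,
			GUC_UNIT_MS
		},
		&wal_retrieve_retry_interval,
		5000, 1, INT_MAX,
		NULL, NULL, NULL
	},

With that in place, a line such as wal_retrieve_retry_interval = 30s in
postgresql.conf takes effect on reload (SIGHUP), without restarting the
server.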
Regards,
--
Michael
Attachments:
20150206_wal_source_check_v6.patch
From 594db45eda43c0abc13ad0dbca81d6ed3f4650e7 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 19 Jan 2015 16:08:48 +0900
Subject: [PATCH] Add wal_retrieve_retry_interval
This parameter aids to control at which timing WAL availability is checked
when a node is in recovery, particularly when successive failures happen when
fetching WAL archives, or when fetching WAL records from a streaming source.
Default value is 5s.
---
doc/src/sgml/config.sgml | 17 +++++
src/backend/access/transam/xlog.c | 91 ++++++++++++++++++---------
src/backend/utils/misc/guc.c | 12 ++++
src/backend/utils/misc/postgresql.conf.sample | 3 +
src/include/access/xlog.h | 1 +
5 files changed, 93 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6bcb106..d82b26a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2985,6 +2985,23 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-wal-retrieve-retry-interval" xreflabel="wal_retrieve_retry_interval">
+ <term><varname>wal_retrieve_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_retrieve_retry_interval</> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specify the amount of time to wait when WAL is not available from
+ any sources (streaming replication, local <filename>pg_xlog</> or
+ WAL archive) before retrying to retrieve WAL. This parameter can
+ only be set in the <filename>postgresql.conf</> file or on the
+ server command line. The default value is 5 seconds.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
</sect1>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..769fbf3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -93,6 +93,7 @@ int sync_method = DEFAULT_SYNC_METHOD;
int wal_level = WAL_LEVEL_MINIMAL;
int CommitDelay = 0; /* precommit delay in microseconds */
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
+int wal_retrieve_retry_interval = 5000;
#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
@@ -10340,8 +10341,8 @@ static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
- static pg_time_t last_fail_time = 0;
- pg_time_t now;
+ TimestampTz now = GetCurrentTimestamp();
+ TimestampTz last_fail_time = now;
/*-------
* Standby mode is implemented by a state machine:
@@ -10351,7 +10352,9 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* 2. Check trigger file
* 3. Read from primary server via walreceiver (XLOG_FROM_STREAM)
* 4. Rescan timelines
- * 5. Sleep 5 seconds, and loop back to 1.
+ * 5. Sleep the amount of time defined by wal_retrieve_retry_interval
+ * while checking periodically for the presence of user-defined
+ * trigger file and loop back to 1.
*
* Failure to read from the current source advances the state machine to
* the next state.
@@ -10490,14 +10493,27 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by wal_retrieve_retry_interval to avoid busy-waiting.
*/
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
+ now = GetCurrentTimestamp();
+ if (!TimestampDifferenceExceeds(last_fail_time, now,
+ wal_retrieve_retry_interval))
{
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
+ long secs, wait_time;
+ int microsecs;
+ TimestampDifference(last_fail_time, now, &secs, &microsecs);
+
+ wait_time = wal_retrieve_retry_interval * 1000L -
+ (1000000L * secs + 1L * microsecs);
+ if (wait_time > 5000000L) /* 5s */
+ wait_time = 5000000L;
+
+ WaitLatch(&XLogCtl->recoveryWakeupLatch,
+ WL_LATCH_SET | WL_TIMEOUT,
+ wait_time / 1000);
+ ResetLatch(&XLogCtl->recoveryWakeupLatch);
+ now = GetCurrentTimestamp();
}
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
@@ -10562,6 +10578,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
case XLOG_FROM_STREAM:
{
bool havedata;
+ long remaining_wait_time;
/*
* Check if WAL receiver is still active.
@@ -10634,33 +10651,45 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
/*
- * Data not here yet. Check for trigger, then wait for
- * walreceiver to wake us up when new WAL arrives.
+ * Wait for more WAL for a total of time defined by
+ * wal_retrieve_retry_interval and check the presence of a
+ * user-defined trigger file every 5s to ensure quick
+ * responsiveness of system.
*/
- if (CheckForStandbyTrigger())
+ remaining_wait_time = wal_retrieve_retry_interval * 1L;
+ while (remaining_wait_time > 0)
{
+ long wait_time = 5000L; /* 5s */
+
+ if (remaining_wait_time < wait_time)
+ wait_time = remaining_wait_time;
+
/*
- * Note that we don't "return false" immediately here.
- * After being triggered, we still want to replay all
- * the WAL that was already streamed. It's in pg_xlog
- * now, so we just treat this as a failure, and the
- * state machine will move on to replay the streamed
- * WAL from pg_xlog, and then recheck the trigger and
- * exit replay.
+ * Data not here yet. Check for trigger, then wait for
+ * walreceiver to wake us up when new WAL arrives.
*/
- lastSourceFailed = true;
- break;
- }
+ if (CheckForStandbyTrigger())
+ {
+ /*
+ * Note that we don't "return false" immediately here.
+ * After being triggered, we still want to replay all
+ * the WAL that was already streamed. It's in pg_xlog
+ * now, so we just treat this as a failure, and the
+ * state machine will move on to replay the streamed
+ * WAL from pg_xlog, and then recheck the trigger and
+ * exit replay.
+ */
+ lastSourceFailed = true;
+ break;
+ }
- /*
- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
- */
- WaitLatch(&XLogCtl->recoveryWakeupLatch,
- WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
- ResetLatch(&XLogCtl->recoveryWakeupLatch);
+ /* wait a bit */
+ WaitLatch(&XLogCtl->recoveryWakeupLatch,
+ WL_LATCH_SET | WL_TIMEOUT,
+ wait_time);
+ ResetLatch(&XLogCtl->recoveryWakeupLatch);
+ remaining_wait_time -= wait_time;
+ }
break;
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9572777..c4598ed 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2364,6 +2364,18 @@ static struct config_int ConfigureNamesInt[] =
},
{
+ {"wal_retrieve_retry_interval", PGC_SIGHUP, WAL_SETTINGS,
+ gettext_noop("Specifies the amount of time to wait when WAL is not "
+ "available from a source."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &wal_retrieve_retry_interval,
+ 5000, 1, INT_MAX,
+ NULL, NULL, NULL
+ },
+
+ {
{"wal_segment_size", PGC_INTERNAL, PRESET_OPTIONS,
gettext_noop("Shows the number of pages per write ahead log segment."),
NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b053659..73e2bca 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -260,6 +260,9 @@
#wal_receiver_timeout = 60s # time that receiver waits for
# communication from master
# in milliseconds; 0 disables
+#wal_retrieve_retry_interval = 5s # time to wait before retrying to
+ # retrieve WAL from a source (streaming
+ # replication, archive or local pg_xlog)
#------------------------------------------------------------------------------
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..be27a85 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -93,6 +93,7 @@ extern int CheckPointSegments;
extern int wal_keep_segments;
extern int XLOGbuffers;
extern int XLogArchiveTimeout;
+extern int wal_retrieve_retry_interval;
extern bool XLogArchiveMode;
extern char *XLogArchiveCommand;
extern bool EnableHotStandby;
--
2.3.0
On Sun, Feb 8, 2015 at 2:54 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> I disagree here. In some replication setups it is interesting to check
> WAL availability from a source more frequently, but the opposite is true
> as well: as Alexey mentioned at the beginning of the thread, the point
> can be to reduce the number of requests when WAL archives are fetched
> from external storage such as AWS S3. Hence a correct solution would be
> to check periodically for the trigger file, with a maximum one-time wait
> of 5s to keep backward-compatible behavior. We could reduce that to 1s
> or something like that as well.
You seem to have misunderstood the code in question. Or I'm missing something.
The timeout of the WaitLatch is just the interval at which the trigger file is
checked while waiting for more WAL to arrive from streaming replication. It is
not related to the retry time to restore WAL from the archive.
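To restate the distinction in code terms, the two waits live in different
places in WaitForWALToBecomeAvailable(); roughly (a simplified sketch based on
the hunks posted in this thread, not the actual function body):

    /*
     * 1. After all sources have failed, just before looping back to the
     *    archive: this is the wait that wal_retrieve_retry_interval is
     *    meant to control (historically a fixed 5s sleep).
     */
    now = GetCurrentTimestamp();
    if (!TimestampDifferenceExceeds(last_fail_time, now,
                                    wal_retrieve_retry_interval))
    {
        /* sleep out the remainder of wal_retrieve_retry_interval */
    }
    last_fail_time = now;
    currentSource = XLOG_FROM_ARCHIVE;

    /*
     * 2. In the XLOG_FROM_STREAM case, while no new WAL has arrived yet:
     *    this 5s timeout only bounds how often the trigger file is
     *    rechecked; it is not the archive retry wait.
     */
    WaitLatch(&XLogCtl->recoveryWakeupLatch,
              WL_LATCH_SET | WL_TIMEOUT,
              5000L);
    ResetLatch(&XLogCtl->recoveryWakeupLatch);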
Regards,
--
Fujii Masao
On Mon, Feb 9, 2015 at 8:29 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> You seem to have misunderstood the code in question. Or I'm missing something.
> The timeout of the WaitLatch is just the interval at which the trigger file is
> checked while waiting for more WAL to arrive from streaming replication. It is
> not related to the retry time to restore WAL from the archive.
[Re-reading the code...]
Aah, yes, you are right. Sorry for the noise. Yes, let's wait for a
maximum of 5s then. I also noticed that in the previous patch the wait
was capped at 5s. To begin with, a loop should have been used if it
were a plain sleep, but as the patch now uses a latch this limit does
not make much sense... Updated patch is attached.
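For reference, the difference between the two coding styles is roughly the
following (sketch only, based on the old and new hunks already quoted in this
thread):

    /*
     * Plain sleep (old coding): nothing can interrupt it, so it must be
     * kept short or chunked in a loop.
     */
    pg_usleep(1000000L * (5 - (now - last_fail_time)));

    /*
     * Latch wait (new coding): returns as soon as the latch is set, for
     * example by the walreceiver or at promotion, so a long timeout does
     * not by itself make the startup process unresponsive.
     */
    WaitLatch(&XLogCtl->recoveryWakeupLatch,
              WL_LATCH_SET | WL_TIMEOUT,
              wait_time);       /* timeout in milliseconds */
    ResetLatch(&XLogCtl->recoveryWakeupLatch);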
Regards,
--
Michael
Attachments:
20150210_wal_source_check_v7.patch (text/x-patch)
From 9b7e3bb32c744b328a0d99db3040cadfcba606aa Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 19 Jan 2015 16:08:48 +0900
Subject: [PATCH] Add wal_retrieve_retry_interval
This parameter aids to control at which timing WAL availability is checked
when a node is in recovery, particularly when successive failures happen when
fetching WAL archives, or when fetching WAL records from a streaming source.
Default value is 5s.
---
doc/src/sgml/config.sgml | 17 ++++++++++
src/backend/access/transam/xlog.c | 46 +++++++++++++++++++--------
src/backend/utils/misc/guc.c | 12 +++++++
src/backend/utils/misc/postgresql.conf.sample | 3 ++
src/include/access/xlog.h | 1 +
5 files changed, 66 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6bcb106..d82b26a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2985,6 +2985,23 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
+ <varlistentry id="guc-wal-retrieve-retry-interval" xreflabel="wal_retrieve_retry_interval">
+ <term><varname>wal_retrieve_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_retrieve_retry_interval</> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specify the amount of time to wait when WAL is not available from
+ any sources (streaming replication, local <filename>pg_xlog</> or
+ WAL archive) before retrying to retrieve WAL. This parameter can
+ only be set in the <filename>postgresql.conf</> file or on the
+ server command line. The default value is 5 seconds.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
</sect1>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..1f9c3c4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -93,6 +93,7 @@ int sync_method = DEFAULT_SYNC_METHOD;
int wal_level = WAL_LEVEL_MINIMAL;
int CommitDelay = 0; /* precommit delay in microseconds */
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
+int wal_retrieve_retry_interval = 5000;
#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
@@ -10340,8 +10341,8 @@ static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
- static pg_time_t last_fail_time = 0;
- pg_time_t now;
+ TimestampTz now = GetCurrentTimestamp();
+ TimestampTz last_fail_time = now;
/*-------
* Standby mode is implemented by a state machine:
@@ -10351,7 +10352,9 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* 2. Check trigger file
* 3. Read from primary server via walreceiver (XLOG_FROM_STREAM)
* 4. Rescan timelines
- * 5. Sleep 5 seconds, and loop back to 1.
+ * 5. Sleep the amount of time defined by wal_retrieve_retry_interval
+ * while checking periodically for the presence of user-defined
+ * trigger file and loop back to 1.
*
* Failure to read from the current source advances the state machine to
* the next state.
@@ -10490,14 +10493,25 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep the amount of time specified
+ * by wal_retrieve_retry_interval.
*/
- now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
+ now = GetCurrentTimestamp();
+ if (!TimestampDifferenceExceeds(last_fail_time, now,
+ wal_retrieve_retry_interval))
{
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
- now = (pg_time_t) time(NULL);
+ long secs, wait_time;
+ int microsecs;
+ TimestampDifference(last_fail_time, now, &secs, &microsecs);
+
+ wait_time = wal_retrieve_retry_interval * 1000L -
+ (1000000L * secs + 1L * microsecs);
+
+ WaitLatch(&XLogCtl->recoveryWakeupLatch,
+ WL_LATCH_SET | WL_TIMEOUT,
+ wait_time / 1000);
+ ResetLatch(&XLogCtl->recoveryWakeupLatch);
+ now = GetCurrentTimestamp();
}
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
@@ -10562,6 +10576,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
case XLOG_FROM_STREAM:
{
bool havedata;
+ long wait_time = wal_retrieve_retry_interval;
/*
* Check if WAL receiver is still active.
@@ -10653,13 +10668,18 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
/*
- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
+ * Wait for more WAL to arrive. Time out after the amount
+ * of time defined by wal_retrieve_retry_interval like when
+ * polling the archive for no more than 5s to ensure quick
+ * responsiveness of system.
*/
+ if (wal_retrieve_retry_interval >= 5000)
+ wait_time = 5000;
+
+ /* wait a bit */
WaitLatch(&XLogCtl->recoveryWakeupLatch,
WL_LATCH_SET | WL_TIMEOUT,
- 5000L);
+ wait_time * 1L);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9572777..c4598ed 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2364,6 +2364,18 @@ static struct config_int ConfigureNamesInt[] =
},
{
+ {"wal_retrieve_retry_interval", PGC_SIGHUP, WAL_SETTINGS,
+ gettext_noop("Specifies the amount of time to wait when WAL is not "
+ "available from a source."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &wal_retrieve_retry_interval,
+ 5000, 1, INT_MAX,
+ NULL, NULL, NULL
+ },
+
+ {
{"wal_segment_size", PGC_INTERNAL, PRESET_OPTIONS,
gettext_noop("Shows the number of pages per write ahead log segment."),
NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b053659..73e2bca 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -260,6 +260,9 @@
#wal_receiver_timeout = 60s # time that receiver waits for
# communication from master
# in milliseconds; 0 disables
+#wal_retrieve_retry_interval = 5s # time to wait before retrying to
+ # retrieve WAL from a source (streaming
+ # replication, archive or local pg_xlog)
#------------------------------------------------------------------------------
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..be27a85 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -93,6 +93,7 @@ extern int CheckPointSegments;
extern int wal_keep_segments;
extern int XLOGbuffers;
extern int XLogArchiveTimeout;
+extern int wal_retrieve_retry_interval;
extern bool XLogArchiveMode;
extern char *XLogArchiveCommand;
extern bool EnableHotStandby;
--
2.3.0
On Tue, Feb 10, 2015 at 10:32 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Patch updated is attached.
Patch moved to CF 2015-02 with same status.
--
Michael
On Tue, Feb 10, 2015 at 10:32 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> [Re-reading the code...]
> Aah, yes, you are right. Sorry for the noise. Yes, let's wait for a
> maximum of 5s then. I also noticed that in the previous patch the wait
> was capped at 5s. To begin with, a loop should have been used if it
> were a plain sleep, but as the patch now uses a latch this limit does
> not make much sense... Updated patch is attached.
On second thought, the interval to check the trigger file is very different
from the wait time before retrying to retrieve WAL. So it seems strange and
even confusing to control them with one parameter. If we really want to make
the interval for the trigger file configurable, we should invent a new GUC
for it. But I don't think that's worth doing. If someone wants the trigger
file to be reacted to more promptly for "fast" promotion, he or she can
basically use the pg_ctl promote command instead. Thoughts?

Attached is the updated version of the patch. I changed the parameter so that
it doesn't affect the interval of checking the trigger file.
> - static pg_time_t last_fail_time = 0;
> - pg_time_t now;
> + TimestampTz now = GetCurrentTimestamp();
> + TimestampTz last_fail_time = now;
I reverted the code here to what it was. I don't think GetCurrentTimestamp()
needs to be called for every call of WaitForWALToBecomeAvailable().
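In other words, the attached version keeps the original pattern and only
changes the type; a sketch of the resulting coding (see the attached patch for
the real hunk):

    static TimestampTz last_fail_time = 0;  /* 0 = no failure seen yet */
    TimestampTz now;

    /* Later, in the failure path only: */
    now = GetCurrentTimestamp();
    if (!TimestampDifferenceExceeds(last_fail_time, now,
                                    wal_retrieve_retry_interval))
    {
        /* wait out the remainder of the interval before retrying */
    }
    last_fail_time = now;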
> + WaitLatch(&XLogCtl->recoveryWakeupLatch,
> + WL_LATCH_SET | WL_TIMEOUT,
> + wait_time / 1000);
Don't we need to specify WL_POSTMASTER_DEATH flag here? Added.
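For context, the flag only adds one more wake-up condition, so the wait also
ends if the postmaster dies instead of sleeping out the whole timeout; roughly
(as used in the attached patch):

    WaitLatch(&XLogCtl->recoveryWakeupLatch,
              WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
              wait_time);
    ResetLatch(&XLogCtl->recoveryWakeupLatch);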
+ {"wal_retrieve_retry_interval", PGC_SIGHUP, WAL_SETTINGS,
WAL_SETTINGS should be REPLICATION_STANDBY? Changed.
Regards,
--
Fujii Masao
Attachments:
20150220_wal_source_check_v8.patch (text/x-patch)
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 2985,2990 **** include_dir 'conf.d'
--- 2985,3008 ----
</listitem>
</varlistentry>
+ <varlistentry id="guc-wal-retrieve-retry-interval" xreflabel="wal_retrieve_retry_interval">
+ <term><varname>wal_retrieve_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wal_retrieve_retry_interval</> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specify the amount of time, in milliseconds, to wait when WAL is not
+ available from any sources (streaming replication,
+ local <filename>pg_xlog</> or WAL archive) before retrying to
+ retrieve WAL. This parameter can only be set in the
+ <filename>postgresql.conf</> file or on the server command line.
+ The default value is 5 seconds.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
</sect1>
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 93,98 **** int sync_method = DEFAULT_SYNC_METHOD;
--- 93,99 ----
int wal_level = WAL_LEVEL_MINIMAL;
int CommitDelay = 0; /* precommit delay in microseconds */
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
+ int wal_retrieve_retry_interval = 5000;
#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
***************
*** 10340,10347 **** static bool
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
! static pg_time_t last_fail_time = 0;
! pg_time_t now;
/*-------
* Standby mode is implemented by a state machine:
--- 10341,10348 ----
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
bool fetching_ckpt, XLogRecPtr tliRecPtr)
{
! static TimestampTz last_fail_time = 0;
! TimestampTz now;
/*-------
* Standby mode is implemented by a state machine:
***************
*** 10351,10357 **** WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* 2. Check trigger file
* 3. Read from primary server via walreceiver (XLOG_FROM_STREAM)
* 4. Rescan timelines
! * 5. Sleep 5 seconds, and loop back to 1.
*
* Failure to read from the current source advances the state machine to
* the next state.
--- 10352,10358 ----
* 2. Check trigger file
* 3. Read from primary server via walreceiver (XLOG_FROM_STREAM)
* 4. Rescan timelines
! * 5. Sleep wal_retrieve_retry_interval milliseconds, and loop back to 1.
*
* Failure to read from the current source advances the state machine to
* the next state.
***************
*** 10490,10503 **** WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
! * since last attempt, sleep 5 seconds to avoid
! * busy-waiting.
*/
! now = (pg_time_t) time(NULL);
! if ((now - last_fail_time) < 5)
{
! pg_usleep(1000000L * (5 - (now - last_fail_time)));
! now = (pg_time_t) time(NULL);
}
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
--- 10491,10515 ----
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
! * since last attempt, sleep wal_retrieve_retry_interval
! * milliseconds to avoid busy-waiting.
*/
! now = GetCurrentTimestamp();
! if (!TimestampDifferenceExceeds(last_fail_time, now,
! wal_retrieve_retry_interval))
{
! long secs, wait_time;
! int usecs;
!
! TimestampDifference(last_fail_time, now, &secs, &usecs);
! wait_time = wal_retrieve_retry_interval -
! (secs * 1000 + usecs / 1000);
!
! WaitLatch(&XLogCtl->recoveryWakeupLatch,
! WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
! wait_time);
! ResetLatch(&XLogCtl->recoveryWakeupLatch);
! now = GetCurrentTimestamp();
}
last_fail_time = now;
currentSource = XLOG_FROM_ARCHIVE;
***************
*** 10653,10664 **** WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
/*
! * Wait for more WAL to arrive. Time out after 5 seconds,
! * like when polling the archive, to react to a trigger
! * file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
! WL_LATCH_SET | WL_TIMEOUT,
5000L);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
--- 10665,10675 ----
}
/*
! * Wait for more WAL to arrive. Time out after 5 seconds
! * to react to a trigger file promptly.
*/
WaitLatch(&XLogCtl->recoveryWakeupLatch,
! WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
5000L);
ResetLatch(&XLogCtl->recoveryWakeupLatch);
break;
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 2364,2369 **** static struct config_int ConfigureNamesInt[] =
--- 2364,2381 ----
},
{
+ {"wal_retrieve_retry_interval", PGC_SIGHUP, REPLICATION_STANDBY,
+ gettext_noop("Sets the time to wait before retrying to retrieve WAL"
+ "after a failed attempt."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &wal_retrieve_retry_interval,
+ 5000, 1, INT_MAX,
+ NULL, NULL, NULL
+ },
+
+ {
{"wal_segment_size", PGC_INTERNAL, PRESET_OPTIONS,
gettext_noop("Shows the number of pages per write ahead log segment."),
NULL,
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 260,265 ****
--- 260,267 ----
#wal_receiver_timeout = 60s # time that receiver waits for
# communication from master
# in milliseconds; 0 disables
+ #wal_retrieve_retry_interval = 5s # time to wait before retrying to
+ # retrieve WAL after a failed attempt
#------------------------------------------------------------------------------
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
***************
*** 93,98 **** extern int CheckPointSegments;
--- 93,99 ----
extern int wal_keep_segments;
extern int XLOGbuffers;
extern int XLogArchiveTimeout;
+ extern int wal_retrieve_retry_interval;
extern bool XLogArchiveMode;
extern char *XLogArchiveCommand;
extern bool EnableHotStandby;
On Fri, Feb 20, 2015 at 9:12 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On second thought, the interval to check the trigger file is very different
> from the wait time before retrying to retrieve WAL. So it seems strange and
> even confusing to control them with one parameter. If we really want to make
> the interval for the trigger file configurable, we should invent a new GUC
> for it. But I don't think that's worth doing. If someone wants the trigger
> file to be reacted to more promptly for "fast" promotion, he or she can
> basically use the pg_ctl promote command instead. Thoughts?
Hm, OK.
> Attached is the updated version of the patch. I changed the parameter so that
> it doesn't affect the interval of checking the trigger file.
>
> Don't we need to specify WL_POSTMASTER_DEATH flag here? Added.
Yeah, I am wondering why this was not added back when 89fd72cb went in, though.
+ {"wal_retrieve_retry_interval", PGC_SIGHUP, WAL_SETTINGS,
WAL_SETTINGS should be REPLICATION_STANDBY? Changed.
Sure.
--
Michael
On Fri, Feb 20, 2015 at 9:33 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> > WAL_SETTINGS should be REPLICATION_STANDBY? Changed.
>
> Sure.
So I pushed the patch.
Regards,
--
Fujii Masao