pgsql: walreceiver uses a temporary replication slot by default
walreceiver uses a temporary replication slot by default
If no permanent replication slot is configured using
primary_slot_name, the walreceiver now creates and uses a temporary
replication slot. A new setting wal_receiver_create_temp_slot can be
used to disable this behavior, for example, if the remote instance is
out of replication slots.
Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: /messages/by-id/CA+fd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V+nqZA@mail.gmail.com
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/329730827848f61eb8d353d5addcbd885fa823da
Modified Files
--------------
doc/src/sgml/config.sgml | 20 +++++++++++
.../libpqwalreceiver/libpqwalreceiver.c | 4 +++
src/backend/replication/walreceiver.c | 41 ++++++++++++++++++++++
src/backend/utils/misc/guc.c | 9 +++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/walreceiver.h | 7 ++++
6 files changed, 82 insertions(+)
Hi Peter,
(Adding Andres and Sergei in CC.)
On Tue, Jan 14, 2020 at 01:57:34PM +0000, Peter Eisentraut wrote:
walreceiver uses a temporary replication slot by default
If no permanent replication slot is configured using
primary_slot_name, the walreceiver now creates and uses a temporary
replication slot. A new setting wal_receiver_create_temp_slot can be
used to disable this behavior, for example, if the remote instance is
out of replication slots.
A recent message from Seigei Kornilov has attracted my attention to
this commit:
/messages/by-id/370331579618998@vla3-6a5326aeb4ee.qloud-c.yandex.net
In the thread about switching primary_conninfo to be reloadable, we
have argued at great lengths that we should never have the WAL
receiver fetch by itself the GUC parameters used for the connection
with its primary. Here is the main area of the discussion:
/messages/by-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.de
The previous thread was long enough so it can easily be missed.
However, it seems to me that we may need to revisit a couple of things
for this commit? In short, the following things:
- wal_receiver_create_temp_slot should be made PGC_POSTMASTER,
similarly to primary_slot_name and primary_conninfo.
- WalReceiverMain() should not load the parameter from the GUC context
by itself.
- RequestXLogStreaming(), called by the startup process, should be in
charge of defining if a temp slot should be used or not.
--
Michael
Hello
In short, the following things:
- wal_receiver_create_temp_slot should be made PGC_POSTMASTER,
similarly to primary_slot_name and primary_conninfo.
- WalReceiverMain() should not load the parameter from the GUC context
by itself.
- RequestXLogStreaming(), called by the startup process, should be in
charge of defining if a temp slot should be used or not.
I would like to cross-post here a patch with such changes that I posted in "allow online change primary_conninfo" thread.
This thread is more appropriate for discussion about wal_receiver_create_temp_slot.
PS: I posted this patch in both threads mostly to make cfbot happy.
regards, Sergei
Attachments:
0001_v7_move_temp_slot_logic_to_startup.patchtext/x-diff; name=0001_v7_move_temp_slot_logic_to_startup.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e07dc01e80..14992a08d7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4137,9 +4137,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
has been configured (using <xref linkend="guc-primary-slot-name"/>).
The default is on. The only reason to turn this off would be if the
remote instance is currently out of available replication slots. This
- parameter can only be set in the <filename>postgresql.conf</filename>
- file or on the server command line. Changes only take effect when the
- WAL receiver process starts a new connection.
+ parameter can only be set at server start.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7f4f784c0e..55e9294ae3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -283,6 +283,7 @@ bool StandbyModeRequested = false;
char *PrimaryConnInfo = NULL;
char *PrimarySlotName = NULL;
char *PromoteTriggerFile = NULL;
+bool wal_receiver_create_temp_slot = true;
/* are we currently in standby mode? */
bool StandbyMode = false;
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index a5e85d32f3..264b544194 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -73,7 +73,6 @@
/* GUC variables */
-bool wal_receiver_create_temp_slot;
int wal_receiver_status_interval;
int wal_receiver_timeout;
bool hot_standby_feedback;
@@ -349,42 +348,23 @@ WalReceiverMain(void)
WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
/*
- * Create temporary replication slot if no slot name is configured or
- * the slot from the previous run was temporary, unless
- * wal_receiver_create_temp_slot is disabled. We also need to handle
- * the case where the previous run used a temporary slot but
- * wal_receiver_create_temp_slot was changed in the meantime. In that
- * case, we delete the old slot name in shared memory. (This would
+ * Create temporary replication slot if requested. In that
+ * case, we update slot name in shared memory. (This would
* all be a bit easier if we just didn't copy the slot name into
* shared memory, since we won't need it again later, but then we
* can't see the slot name in the stats views.)
*/
- if (slotname[0] == '\0' || is_temp_slot)
+ if (is_temp_slot)
{
- bool changed = false;
+ snprintf(slotname, sizeof(slotname),
+ "pg_walreceiver_%lld",
+ (long long int) walrcv_get_backend_pid(wrconn));
- if (wal_receiver_create_temp_slot)
- {
- snprintf(slotname, sizeof(slotname),
- "pg_walreceiver_%lld",
- (long long int) walrcv_get_backend_pid(wrconn));
-
- walrcv_create_slot(wrconn, slotname, true, 0, NULL);
- changed = true;
- }
- else if (slotname[0] != '\0')
- {
- slotname[0] = '\0';
- changed = true;
- }
+ walrcv_create_slot(wrconn, slotname, true, 0, NULL);
- if (changed)
- {
- SpinLockAcquire(&walrcv->mutex);
- strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
- walrcv->is_temp_slot = wal_receiver_create_temp_slot;
- SpinLockRelease(&walrcv->mutex);
- }
+ SpinLockAcquire(&walrcv->mutex);
+ strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
+ SpinLockRelease(&walrcv->mutex);
}
/*
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 89c903e45a..a36bc83d2c 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -248,10 +248,16 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
else
walrcv->conninfo[0] = '\0';
- if (slotname != NULL)
+ if (slotname != NULL && slotname[0] != '\0')
+ {
strlcpy((char *) walrcv->slotname, slotname, NAMEDATALEN);
+ walrcv->is_temp_slot = false;
+ }
else
+ {
walrcv->slotname[0] = '\0';
+ walrcv->is_temp_slot = wal_receiver_create_temp_slot;
+ }
if (walrcv->walRcvState == WALRCV_STOPPED)
{
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9f179a9129..c4b36eb2ad 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1994,7 +1994,7 @@ static struct config_bool ConfigureNamesBool[] =
},
{
- {"wal_receiver_create_temp_slot", PGC_SIGHUP, REPLICATION_STANDBY,
+ {"wal_receiver_create_temp_slot", PGC_POSTMASTER, REPLICATION_STANDBY,
gettext_noop("Sets whether a WAL receiver should create a temporary replication slot if no permanent slot is configured."),
},
&wal_receiver_create_temp_slot,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e1048c0047..4454407b7c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -322,6 +322,7 @@
# when reading streaming WAL;
# -1 allows indefinite delay
#wal_receiver_create_temp_slot = on # create temp slot if primary_slot_name not set
+ # (change requires restart)
#wal_receiver_status_interval = 10s # send replies at least this often
# 0 disables
#hot_standby_feedback = off # send info from standby to prevent
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 98b033fc20..12362421d7 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -129,6 +129,7 @@ extern int recoveryTargetAction;
extern int recovery_min_apply_delay;
extern char *PrimaryConnInfo;
extern char *PrimarySlotName;
+extern bool wal_receiver_create_temp_slot;
/* indirectly set via GUC system */
extern TransactionId recoveryTargetXid;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e08afc6548..9d25a7f91b 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -23,7 +23,6 @@
#include "utils/tuplestore.h"
/* user-settable parameters */
-extern bool wal_receiver_create_temp_slot;
extern int wal_receiver_status_interval;
extern int wal_receiver_timeout;
extern bool hot_standby_feedback;
On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org> wrote:
walreceiver uses a temporary replication slot by default
If no permanent replication slot is configured using
primary_slot_name, the walreceiver now creates and uses a temporary
replication slot. A new setting wal_receiver_create_temp_slot can be
used to disable this behavior, for example, if the remote instance is
out of replication slots.Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: /messages/by-id/CA+fd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V+nqZA@mail.gmail.com
Neither the commit message for this patch nor any of the comments in
the patch seem to explain why this is a desirable change.
I assume that's probably discussed on the thread that is linked here,
but you shouldn't have to dig through the discussion thread to figure
out what the benefits of a change like this are.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2020-01-23 21:49, Robert Haas wrote:
On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org> wrote:
walreceiver uses a temporary replication slot by default
If no permanent replication slot is configured using
primary_slot_name, the walreceiver now creates and uses a temporary
replication slot. A new setting wal_receiver_create_temp_slot can be
used to disable this behavior, for example, if the remote instance is
out of replication slots.Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: /messages/by-id/CA+fd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V+nqZA@mail.gmail.comNeither the commit message for this patch nor any of the comments in
the patch seem to explain why this is a desirable change.I assume that's probably discussed on the thread that is linked here,
but you shouldn't have to dig through the discussion thread to figure
out what the benefits of a change like this are.
You are right, this has gotten a bit lost in the big thread.
The rationale is basically the same as why client-side tools like
pg_basebackup use a temporary slot: So that the WAL data that they are
interested in doesn't disappear while they are connected.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020-01-22 06:55, Michael Paquier wrote:
In the thread about switching primary_conninfo to be reloadable, we
have argued at great lengths that we should never have the WAL
receiver fetch by itself the GUC parameters used for the connection
with its primary. Here is the main area of the discussion:
/messages/by-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.de
The way I understood that discussion was that the issue is having both
the startup process and the WAL receiver having possibly inconsistent
knowledge about the current configuration. That doesn't apply in this
case, because the setting is only used by the WAL receiver. Maybe I
misunderstood.
The previous thread was long enough so it can easily be missed.
However, it seems to me that we may need to revisit a couple of things
for this commit? In short, the following things:
- wal_receiver_create_temp_slot should be made PGC_POSTMASTER,
similarly to primary_slot_name and primary_conninfo.
- WalReceiverMain() should not load the parameter from the GUC context
by itself.
- RequestXLogStreaming(), called by the startup process, should be in
charge of defining if a temp slot should be used or not.
That would be a reasonable fix if we think the above is really an issue.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 2020-02-10 16:46:04 +0100, Peter Eisentraut wrote:
On 2020-01-22 06:55, Michael Paquier wrote:
In the thread about switching primary_conninfo to be reloadable, we
have argued at great lengths that we should never have the WAL
receiver fetch by itself the GUC parameters used for the connection
with its primary. Here is the main area of the discussion:
/messages/by-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.deThe way I understood that discussion was that the issue is having both the
startup process and the WAL receiver having possibly inconsistent knowledge
about the current configuration. That doesn't apply in this case, because
the setting is only used by the WAL receiver. Maybe I misunderstood.
Yes, that was my concern there. I do agree there's much less of an issue
here.
I still architecturally don't find it attractive that the active
configuration between walreceiver and startup process can diverge
though. Imagine if we e.g. added the ability to receive WAL over
multiple connections from one host, or from multiple hosts (e.g. to be
able to get the bulk of the WAL from a cascading node, but also to
provide syncrep acknowledgements directly to the primary), or to allow
for logical replication without needing all WAL locally on a standby
doing decoding. It seems not great if there's potentially diverging
configuration (hot standby feedback, temporary slots, ... ) between
those walreceivers, just depending on when they started. Here the model
e.g. paralell workers use, which explicitly ensure that the GUC state is
the same in workers and the leader, is considerably better, imo.
So I think adding more of these parameters affecting walreceivers
without coordination is not going quite in the right direction.
Greetings,
Andres Freund
Hello,
On Mon, 10 Feb 2020 16:37:53 +0100
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 2020-01-23 21:49, Robert Haas wrote:
On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org>
wrote:walreceiver uses a temporary replication slot by default
If no permanent replication slot is configured using
primary_slot_name, the walreceiver now creates and uses a temporary
replication slot. A new setting wal_receiver_create_temp_slot can be
used to disable this behavior, for example, if the remote instance is
out of replication slots.Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion:
/messages/by-id/CA+fd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V+nqZA@mail.gmail.comNeither the commit message for this patch nor any of the comments in
the patch seem to explain why this is a desirable change.I assume that's probably discussed on the thread that is linked here,
but you shouldn't have to dig through the discussion thread to figure
out what the benefits of a change like this are.You are right, this has gotten a bit lost in the big thread.
The rationale is basically the same as why client-side tools like
pg_basebackup use a temporary slot: So that the WAL data that they are
interested in doesn't disappear while they are connected.
In my humble opinion, I prefer the previous behavior, streaming without
temporary slot, for one reason: primary availability.
Should the standby lag far behind the primary (no matter the root cause),
the standby was disconnected because of missing WAL. Worst case scenario, we
must rebuild it, hopefully from backups. Best case scenario, it fetches WALs
from PITR backup. As soon as the later is possible in the stack, I consider slot
like a burden from the operability point of view. If standbys can not fetch
archived WAL from PITR, then we can consider slots.
With temp slot created by default, if one standby lag far behind, it can make
the primary unavailable. We have nothing yet to forbid a slot to fill the
pg_wal partition. How new users creating their first cluster would react in such
situation? I suppose the original discussion was mostly targeting them?
Recovering from this is way more scary than building a standby.
So the default behavior might not be desirable and maybe
wal_receiver_create_temp_slot might be off by default?
Note that Kyotaro HORIGUCHI is working on a patch to restricting maximum keep
segments by repslots:
/messages/by-id/20190627162256.4f4872b8@firost
Regards,
On Mon, Feb 10, 2020 at 01:46:04PM -0800, Andres Freund wrote:
I still architecturally don't find it attractive that the active
configuration between walreceiver and startup process can diverge
though. Imagine if we e.g. added the ability to receive WAL over
multiple connections from one host, or from multiple hosts (e.g. to be
able to get the bulk of the WAL from a cascading node, but also to
provide syncrep acknowledgements directly to the primary), or to allow
for logical replication without needing all WAL locally on a standby
doing decoding. It seems not great if there's potentially diverging
configuration (hot standby feedback, temporary slots, ... ) between
those walreceivers, just depending on when they started. Here the model
e.g. parallel workers use, which explicitly ensure that the GUC state is
the same in workers and the leader, is considerably better, imo.
Yes, I still think that we should fix that inconsistency, mark the new
GUC wal_receiver_create_temp_slot as PGC_POSTMASTER, and add a note at
the top of RequestXLogStreaming() and walreceiver.c about the
assumptions we'd prefer rely to for the GUCs starting a WAL receiver.
So I think adding more of these parameters affecting walreceivers
without coordination is not going quite in the right direction.
Indeed. Adding more comments would be one way to prevent the
situation to happen here, I fear that others may forget this stuff in
the future.
--
Michael
On 2020/02/12 7:53, Jehan-Guillaume de Rorthais wrote:
Hello,
On Mon, 10 Feb 2020 16:37:53 +0100
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:On 2020-01-23 21:49, Robert Haas wrote:
On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org>
wrote:walreceiver uses a temporary replication slot by default
If no permanent replication slot is configured using
primary_slot_name, the walreceiver now creates and uses a temporary
replication slot. A new setting wal_receiver_create_temp_slot can be
used to disable this behavior, for example, if the remote instance is
out of replication slots.Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion:
/messages/by-id/CA+fd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V+nqZA@mail.gmail.comNeither the commit message for this patch nor any of the comments in
the patch seem to explain why this is a desirable change.I assume that's probably discussed on the thread that is linked here,
but you shouldn't have to dig through the discussion thread to figure
out what the benefits of a change like this are.You are right, this has gotten a bit lost in the big thread.
The rationale is basically the same as why client-side tools like
pg_basebackup use a temporary slot: So that the WAL data that they are
interested in doesn't disappear while they are connected.In my humble opinion, I prefer the previous behavior, streaming without
temporary slot, for one reason: primary availability.
+1
Should the standby lag far behind the primary (no matter the root cause),
the standby was disconnected because of missing WAL. Worst case scenario, we
must rebuild it, hopefully from backups. Best case scenario, it fetches WALs
from PITR backup. As soon as the later is possible in the stack, I consider slot
like a burden from the operability point of view. If standbys can not fetch
archived WAL from PITR, then we can consider slots.With temp slot created by default, if one standby lag far behind, it can make
the primary unavailable. We have nothing yet to forbid a slot to fill the
pg_wal partition. How new users creating their first cluster would react in such
situation? I suppose the original discussion was mostly targeting them?
Recovering from this is way more scary than building a standby.So the default behavior might not be desirable and maybe
wal_receiver_create_temp_slot might be off by default?Note that Kyotaro HORIGUCHI is working on a patch to restricting maximum keep
segments by repslots:
Yeah, I think it's better to disable this option until something like
Horiguchi-san's proposal will have been committed, i.e., until
the upper limit on the number (or size) of WAL files that remain
for slots become configurable.
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
On Wed, Feb 12, 2020 at 06:11:06PM +0900, Fujii Masao wrote:
On 2020/02/12 7:53, Jehan-Guillaume de Rorthais wrote:
In my humble opinion, I prefer the previous behavior, streaming without
temporary slot, for one reason: primary availability.+1
With temp slot created by default, if one standby lag far behind, it can make
the primary unavailable. We have nothing yet to forbid a slot to fill the
pg_wal partition. How new users creating their first cluster would react in such
situation? I suppose the original discussion was mostly targeting them?
Recovering from this is way more scary than building a standby.So the default behavior might not be desirable and maybe
wal_receiver_create_temp_slot might be off by default?Note that Kyotaro HORIGUCHI is working on a patch to restricting maximum keep
segments by repslots:Yeah, I think it's better to disable this option until something like
Horiguchi-san's proposal will have been committed, i.e., until
the upper limit on the number (or size) of WAL files that remain
for slots become configurable.
Even with that, are we sure this extra feature would be a reason
sufficient to change the default value of this option to be enabled?
I am not sure about that either. My opinion is that this option is
useful to have and that it is not really a problem if you have slot
monitoring on the primary (or a standby for cascading). And I'd like
to believe that it is a common practice lately for base backups,
archivers based on pg_receivewal or even logical decoding, but it
could be surprising for some users who do not do that yet. So
Jehan-Guillaume's arguments sound also sensible to me (he also
maintains an automatic failover solution called PAF).
From what I can see nobody really likes the current state of things
for this option, and that does not come down only to its default
value. The default GUC value and the way the parameter is loaded by
the WAL sender are problematic, still easy enough to fix. How do we
move on from here? I could post a patch based on what Sergei Kornilov
has sent around [1]/messages/by-id/20200122055510.GH174860@paquier.xyz -- Michael, but that's Peter's feature. Any opinions?
[1]: /messages/by-id/20200122055510.GH174860@paquier.xyz -- Michael
--
Michael
At Thu, 13 Feb 2020 16:48:21 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Wed, Feb 12, 2020 at 06:11:06PM +0900, Fujii Masao wrote:
On 2020/02/12 7:53, Jehan-Guillaume de Rorthais wrote:
In my humble opinion, I prefer the previous behavior, streaming without
temporary slot, for one reason: primary availability.+1
With temp slot created by default, if one standby lag far behind, it can make
the primary unavailable. We have nothing yet to forbid a slot to fill the
pg_wal partition. How new users creating their first cluster would react in such
situation? I suppose the original discussion was mostly targeting them?
Recovering from this is way more scary than building a standby.So the default behavior might not be desirable and maybe
wal_receiver_create_temp_slot might be off by default?Note that Kyotaro HORIGUCHI is working on a patch to restricting maximum keep
segments by repslots:Yeah, I think it's better to disable this option until something like
Horiguchi-san's proposal will have been committed, i.e., until
the upper limit on the number (or size) of WAL files that remain
for slots become configurable.Even with that, are we sure this extra feature would be a reason
sufficient to change the default value of this option to be enabled?
I think the feature (slot limit) is not going to be an reason to
enable it (tmp slot). In the first place I think we cannot determine
the default value generally workable..
I am not sure about that either. My opinion is that this option is
useful to have and that it is not really a problem if you have slot
monitoring on the primary (or a standby for cascading). And I'd like
to believe that it is a common practice lately for base backups,
archivers based on pg_receivewal or even logical decoding, but it
could be surprising for some users who do not do that yet. So
Jehan-Guillaume's arguments sound also sensible to me (he also
maintains an automatic failover solution called PAF).From what I can see nobody really likes the current state of things
for this option, and that does not come down only to its default
value. The default GUC value and the way the parameter is loaded by
the WAL sender are problematic, still easy enough to fix. How do we
move on from here? I could post a patch based on what Sergei Kornilov
has sent around [1], but that's Peter's feature. Any opinions?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Jan 22, 2020 at 06:58:46PM +0300, Sergei Kornilov wrote:
I would like to cross-post here a patch with such changes that I posted in "allow online change primary_conninfo" thread.
This thread is more appropriate for discussion about wal_receiver_create_temp_slot.PS: I posted this patch in both threads mostly to make cfbot happy.
Thanks for posting this patch, Sergei. Here is a review to make
things move on.
- * Create temporary replication slot if no slot name is configured or
- * the slot from the previous run was temporary, unless
- * wal_receiver_create_temp_slot is disabled. We also need to handle
- * the case where the previous run used a temporary slot but
- * wal_receiver_create_temp_slot was changed in the meantime. In that
- * case, we delete the old slot name in shared memory. (This would
+ * Create temporary replication slot if requested. In that
+ * case, we update slot name in shared memory. (This would
The set of comments you are removing from walreceiver.c to decide if a
temporary slot needs to be created or not should be moved to
walreceiverfuncs.c as you move the logic from the WAL receiver startup
phase to the moment the WAL receiver spawn is requested.
I agree with the simplifications in WalReceiverMain() as you have
switched wal_receiver_create_temp_slot to be PGC_POSTMASTER, so
modifications are no longer a state that matter.
It would be more consistent with primary_conn_info and
primary_slot_name if wal_receiver_create_temp_slot is passed down as
an argument of RequestXLogStreaming().
As per the discussion done on this thread, let's also switch the
parameter default to be disabled. Peter, as the committer of 3297308,
it would be good if you could chime in.
--
Michael
Hello
Thanks for posting this patch, Sergei. Here is a review to make
things move on.
Thank you, here is updated patch
The set of comments you are removing from walreceiver.c to decide if a
temporary slot needs to be created or not should be moved to
walreceiverfuncs.c as you move the logic from the WAL receiver startup
phase to the moment the WAL receiver spawn is requested.
I changed this comments because they describes behavior during change value of wal_receiver_create_temp_slot.
But yes, I need to add some comments to RequestXLogStreaming.
It would be more consistent with primary_conn_info and
primary_slot_name if wal_receiver_create_temp_slot is passed down as
an argument of RequestXLogStreaming().
Yep, I thought about that. Changed.
As per the discussion done on this thread, let's also switch the
parameter default to be disabled.
Done (my vote is also for disabling this option by default).
regards, Sergei
Attachments:
0001-v2-change-wal_receiver_create_temp_slot.patchtext/x-diff; name=0001-v2-change-wal_receiver_create_temp_slot.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c1128f89ec..683d87c491 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4140,11 +4140,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
Specifies whether a WAL receiver should create a temporary replication
slot on the remote instance when no permanent replication slot to use
has been configured (using <xref linkend="guc-primary-slot-name"/>).
- The default is on. The only reason to turn this off would be if the
- remote instance is currently out of available replication slots. This
- parameter can only be set in the <filename>postgresql.conf</filename>
- file or on the server command line. Changes only take effect when the
- WAL receiver process starts a new connection.
+ The default is off. This parameter can only be set at server start.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3813eadfb4..e0f3ed5c2a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -283,6 +283,7 @@ bool StandbyModeRequested = false;
char *PrimaryConnInfo = NULL;
char *PrimarySlotName = NULL;
char *PromoteTriggerFile = NULL;
+bool wal_receiver_create_temp_slot = false;
/* are we currently in standby mode? */
bool StandbyMode = false;
@@ -11901,7 +11902,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
curFileTLI = tli;
RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
- PrimarySlotName);
+ PrimarySlotName, wal_receiver_create_temp_slot);
receivedUpto = 0;
}
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 2ab15c3cbb..01452a175f 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -73,7 +73,6 @@
/* GUC variables */
-bool wal_receiver_create_temp_slot;
int wal_receiver_status_interval;
int wal_receiver_timeout;
bool hot_standby_feedback;
@@ -348,42 +347,23 @@ WalReceiverMain(void)
WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
/*
- * Create temporary replication slot if no slot name is configured or
- * the slot from the previous run was temporary, unless
- * wal_receiver_create_temp_slot is disabled. We also need to handle
- * the case where the previous run used a temporary slot but
- * wal_receiver_create_temp_slot was changed in the meantime. In that
- * case, we delete the old slot name in shared memory. (This would
+ * Create temporary replication slot if requested. In that
+ * case, we update slot name in shared memory. (This would
* all be a bit easier if we just didn't copy the slot name into
* shared memory, since we won't need it again later, but then we
* can't see the slot name in the stats views.)
*/
- if (slotname[0] == '\0' || is_temp_slot)
+ if (is_temp_slot)
{
- bool changed = false;
+ snprintf(slotname, sizeof(slotname),
+ "pg_walreceiver_%lld",
+ (long long int) walrcv_get_backend_pid(wrconn));
- if (wal_receiver_create_temp_slot)
- {
- snprintf(slotname, sizeof(slotname),
- "pg_walreceiver_%lld",
- (long long int) walrcv_get_backend_pid(wrconn));
-
- walrcv_create_slot(wrconn, slotname, true, 0, NULL);
- changed = true;
- }
- else if (slotname[0] != '\0')
- {
- slotname[0] = '\0';
- changed = true;
- }
+ walrcv_create_slot(wrconn, slotname, true, 0, NULL);
- if (changed)
- {
- SpinLockAcquire(&walrcv->mutex);
- strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
- walrcv->is_temp_slot = wal_receiver_create_temp_slot;
- SpinLockRelease(&walrcv->mutex);
- }
+ SpinLockAcquire(&walrcv->mutex);
+ strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
+ SpinLockRelease(&walrcv->mutex);
}
/*
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 89c903e45a..ee0dd693b2 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -216,12 +216,13 @@ ShutdownWalRcv(void)
* Request postmaster to start walreceiver.
*
* recptr indicates the position where streaming should begin, conninfo
- * is a libpq connection string to use, and slotname is, optionally, the name
- * of a replication slot to acquire.
+ * is a libpq connection string to use, slotname is, optionally, the name
+ * of a replication slot to acquire, create_temp_slot indicates to create
+ * a temporary slot when no slotname was given.
*/
void
RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
- const char *slotname)
+ const char *slotname, bool create_temp_slot)
{
WalRcvData *walrcv = WalRcv;
bool launch = false;
@@ -248,10 +249,22 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
else
walrcv->conninfo[0] = '\0';
- if (slotname != NULL)
+ /*
+ * Use configured replication slot if present. We ignore create_temp_slot
+ * option here because slotname should be persistent. Otherwise, use
+ * create_temp_slot to determine whether walreceiver should create a
+ * temporary slot itself or not use replication slots at all.
+ */
+ if (slotname != NULL && slotname[0] != '\0')
+ {
strlcpy((char *) walrcv->slotname, slotname, NAMEDATALEN);
+ walrcv->is_temp_slot = false;
+ }
else
+ {
walrcv->slotname[0] = '\0';
+ walrcv->is_temp_slot = create_temp_slot;
+ }
if (walrcv->walRcvState == WALRCV_STOPPED)
{
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 8228e1f390..ec9cd3696c 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2029,11 +2029,11 @@ static struct config_bool ConfigureNamesBool[] =
},
{
- {"wal_receiver_create_temp_slot", PGC_SIGHUP, REPLICATION_STANDBY,
+ {"wal_receiver_create_temp_slot", PGC_POSTMASTER, REPLICATION_STANDBY,
gettext_noop("Sets whether a WAL receiver should create a temporary replication slot if no permanent slot is configured."),
},
&wal_receiver_create_temp_slot,
- true,
+ false,
NULL, NULL, NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e1048c0047..d2604a98b4 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -321,7 +321,8 @@
#max_standby_streaming_delay = 30s # max delay before canceling queries
# when reading streaming WAL;
# -1 allows indefinite delay
-#wal_receiver_create_temp_slot = on # create temp slot if primary_slot_name not set
+#wal_receiver_create_temp_slot = off # create temp slot if primary_slot_name not set
+ # (change requires restart)
#wal_receiver_status_interval = 10s # send replies at least this often
# 0 disables
#hot_standby_feedback = off # send info from standby to prevent
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 98b033fc20..12362421d7 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -129,6 +129,7 @@ extern int recoveryTargetAction;
extern int recovery_min_apply_delay;
extern char *PrimaryConnInfo;
extern char *PrimarySlotName;
+extern bool wal_receiver_create_temp_slot;
/* indirectly set via GUC system */
extern TransactionId recoveryTargetXid;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e08afc6548..cf3e43128c 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -23,7 +23,6 @@
#include "utils/tuplestore.h"
/* user-settable parameters */
-extern bool wal_receiver_create_temp_slot;
extern int wal_receiver_status_interval;
extern int wal_receiver_timeout;
extern bool hot_standby_feedback;
@@ -321,7 +320,8 @@ extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
extern void RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr,
- const char *conninfo, const char *slotname);
+ const char *conninfo, const char *slotname,
+ bool create_temp_slot);
extern XLogRecPtr GetWalRcvWriteRecPtr(XLogRecPtr *latestChunkStart, TimeLineID *receiveTLI);
extern int GetReplicationApplyDelay(void);
extern int GetReplicationTransferLatency(void);
On Mon, Feb 17, 2020 at 04:57:04PM +0300, Sergei Kornilov wrote:
Thank you, here is updated patch
Thanks
I changed this comments because they describes behavior during
change value of wal_receiver_create_temp_slot. But yes, I need to
add some comments to RequestXLogStreaming.
I have reworked that part, adding more comments about the use of GUC
parameters when establishing the connection to the primary for a WAL
receiver. And also I have added an extra comment to walreceiver.c
about the use of GUcs in general, to avoid this stuff again in the
future. There were some extra nits with the format of
postgresql.conf.sample.
As per the discussion done on this thread, let's also switch the
parameter default to be disabled.Done (my vote is also for disabling this option by default).
We visibly tend to move in this direction, at least based on our
discussion. Let's see where this leads. For now, I have registered
this patch to next CF (https://commitfest.postgresql.org/27/2456/),
with yourself as author and myself as reviewer, and then let's wait
for mainly Peter E. and others for more input.
--
Michael
Attachments:
0001-v3-change-wal_receiver_create_temp_slot.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 98b033fc20..12362421d7 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -129,6 +129,7 @@ extern int recoveryTargetAction;
extern int recovery_min_apply_delay;
extern char *PrimaryConnInfo;
extern char *PrimarySlotName;
+extern bool wal_receiver_create_temp_slot;
/* indirectly set via GUC system */
extern TransactionId recoveryTargetXid;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e08afc6548..cf3e43128c 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -23,7 +23,6 @@
#include "utils/tuplestore.h"
/* user-settable parameters */
-extern bool wal_receiver_create_temp_slot;
extern int wal_receiver_status_interval;
extern int wal_receiver_timeout;
extern bool hot_standby_feedback;
@@ -321,7 +320,8 @@ extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
extern void RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr,
- const char *conninfo, const char *slotname);
+ const char *conninfo, const char *slotname,
+ bool create_temp_slot);
extern XLogRecPtr GetWalRcvWriteRecPtr(XLogRecPtr *latestChunkStart, TimeLineID *receiveTLI);
extern int GetReplicationApplyDelay(void);
extern int GetReplicationTransferLatency(void);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3813eadfb4..e0f3ed5c2a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -283,6 +283,7 @@ bool StandbyModeRequested = false;
char *PrimaryConnInfo = NULL;
char *PrimarySlotName = NULL;
char *PromoteTriggerFile = NULL;
+bool wal_receiver_create_temp_slot = false;
/* are we currently in standby mode? */
bool StandbyMode = false;
@@ -11901,7 +11902,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
curFileTLI = tli;
RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
- PrimarySlotName);
+ PrimarySlotName, wal_receiver_create_temp_slot);
receivedUpto = 0;
}
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 2ab15c3cbb..ff45482faa 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -15,6 +15,12 @@
* WalRcv->receivedUpto variable in shared memory, to inform the startup
* process of how far it can proceed with XLOG replay.
*
+ * WAL receivers cannot load directly GUC parameters used when establishing
+ * their connection to the primary, and rely on parameter values passed down
+ * by the startup process when WAL streaming is requested. This applies
+ * to for example the replication slot creation and the connection string to
+ * use for the connection with the primary.
+ *
* If the primary server ends streaming, but doesn't disconnect, walreceiver
* goes into "waiting" mode, and waits for the startup process to give new
* instructions. The startup process will treat that the same as
@@ -73,7 +79,6 @@
/* GUC variables */
-bool wal_receiver_create_temp_slot;
int wal_receiver_status_interval;
int wal_receiver_timeout;
bool hot_standby_feedback;
@@ -348,42 +353,23 @@ WalReceiverMain(void)
WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
/*
- * Create temporary replication slot if no slot name is configured or
- * the slot from the previous run was temporary, unless
- * wal_receiver_create_temp_slot is disabled. We also need to handle
- * the case where the previous run used a temporary slot but
- * wal_receiver_create_temp_slot was changed in the meantime. In that
- * case, we delete the old slot name in shared memory. (This would
+ * Create temporary replication slot if requested. In that
+ * case, we update slot name in shared memory. (This would
* all be a bit easier if we just didn't copy the slot name into
* shared memory, since we won't need it again later, but then we
* can't see the slot name in the stats views.)
*/
- if (slotname[0] == '\0' || is_temp_slot)
+ if (is_temp_slot)
{
- bool changed = false;
+ snprintf(slotname, sizeof(slotname),
+ "pg_walreceiver_%lld",
+ (long long int) walrcv_get_backend_pid(wrconn));
- if (wal_receiver_create_temp_slot)
- {
- snprintf(slotname, sizeof(slotname),
- "pg_walreceiver_%lld",
- (long long int) walrcv_get_backend_pid(wrconn));
+ walrcv_create_slot(wrconn, slotname, true, 0, NULL);
- walrcv_create_slot(wrconn, slotname, true, 0, NULL);
- changed = true;
- }
- else if (slotname[0] != '\0')
- {
- slotname[0] = '\0';
- changed = true;
- }
-
- if (changed)
- {
- SpinLockAcquire(&walrcv->mutex);
- strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
- walrcv->is_temp_slot = wal_receiver_create_temp_slot;
- SpinLockRelease(&walrcv->mutex);
- }
+ SpinLockAcquire(&walrcv->mutex);
+ strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
+ SpinLockRelease(&walrcv->mutex);
}
/*
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 89c903e45a..821680a857 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -215,13 +215,19 @@ ShutdownWalRcv(void)
/*
* Request postmaster to start walreceiver.
*
- * recptr indicates the position where streaming should begin, conninfo
- * is a libpq connection string to use, and slotname is, optionally, the name
- * of a replication slot to acquire.
+ * "recptr" indicates the position where streaming should begin. "conninfo"
+ * is a libpq connection string to use. "slotname" is, optionally, the name
+ * of a replication slot to acquire. "create_temp_slot" indicates to create
+ * a temporary slot when no "slotname" is given.
+ *
+ * WAL receivers do not directly load GUC parameters used for the connection
+ * to the primary, and rely on the values passed down by the caller of this
+ * routine instead. Hence, the addition of any new parameters should happen
+ * through this code path.
*/
void
RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
- const char *slotname)
+ const char *slotname, bool create_temp_slot)
{
WalRcvData *walrcv = WalRcv;
bool launch = false;
@@ -248,10 +254,22 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
else
walrcv->conninfo[0] = '\0';
+ /*
+ * Use configured replication slot if present, and ignore the value
+ * of create_temp_slot as the slot name should be persistent. Otherwise,
+ * use create_temp_slot to determine whether this WAL receiver should
+ * create a temporary slot by itself and use it, or not.
+ */
if (slotname != NULL)
+ {
strlcpy((char *) walrcv->slotname, slotname, NAMEDATALEN);
+ walrcv->is_temp_slot = false;
+ }
else
+ {
walrcv->slotname[0] = '\0';
+ walrcv->is_temp_slot = create_temp_slot;
+ }
if (walrcv->walRcvState == WALRCV_STOPPED)
{
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 8228e1f390..ec9cd3696c 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2029,11 +2029,11 @@ static struct config_bool ConfigureNamesBool[] =
},
{
- {"wal_receiver_create_temp_slot", PGC_SIGHUP, REPLICATION_STANDBY,
+ {"wal_receiver_create_temp_slot", PGC_POSTMASTER, REPLICATION_STANDBY,
gettext_noop("Sets whether a WAL receiver should create a temporary replication slot if no permanent slot is configured."),
},
&wal_receiver_create_temp_slot,
- true,
+ false,
NULL, NULL, NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e1048c0047..1dd2a86567 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -321,7 +321,9 @@
#max_standby_streaming_delay = 30s # max delay before canceling queries
# when reading streaming WAL;
# -1 allows indefinite delay
-#wal_receiver_create_temp_slot = on # create temp slot if primary_slot_name not set
+#wal_receiver_create_temp_slot = off # Create temp slot if primary_slot_name
+ # is not set.
+ # (change requires restart)
#wal_receiver_status_interval = 10s # send replies at least this often
# 0 disables
#hot_standby_feedback = off # send info from standby to prevent
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c1128f89ec..683d87c491 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4140,11 +4140,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
Specifies whether a WAL receiver should create a temporary replication
slot on the remote instance when no permanent replication slot to use
has been configured (using <xref linkend="guc-primary-slot-name"/>).
- The default is on. The only reason to turn this off would be if the
- remote instance is currently out of available replication slots. This
- parameter can only be set in the <filename>postgresql.conf</filename>
- file or on the server command line. Changes only take effect when the
- WAL receiver process starts a new connection.
+ The default is off. This parameter can only be set at server start.
</para>
</listitem>
</varlistentry>
Hello
I have reworked that part, adding more comments about the use of GUC
parameters when establishing the connection to the primary for a WAL
receiver. And also I have added an extra comment to walreceiver.c
about the use of GUcs in general, to avoid this stuff again in the
future. There were some extra nits with the format of
postgresql.conf.sample.
Thank you! I just noticed that you removed my proposed change to this condition in RequestXLogStreaming
- if (slotname != NULL)
+ if (slotname != NULL && slotname[0] != '\0')
We need this change to set is_temp_slot properly. PrimarySlotName GUC can usually be an empty string, so just "slotname != NULL" is not enough.
I attached patch with this change.
regards, Sergei
Attachments:
0001-v3-change-wal_receiver_create_temp_slot.patchtext/x-diff; name=0001-v3-change-wal_receiver_create_temp_slot.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 672bf6f1ee..8b742c83c5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4160,11 +4160,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
Specifies whether a WAL receiver should create a temporary replication
slot on the remote instance when no permanent replication slot to use
has been configured (using <xref linkend="guc-primary-slot-name"/>).
- The default is on. The only reason to turn this off would be if the
- remote instance is currently out of available replication slots. This
- parameter can only be set in the <filename>postgresql.conf</filename>
- file or on the server command line. Changes only take effect when the
- WAL receiver process starts a new connection.
+ The default is off. This parameter can only be set at server start.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index de2d4ee582..483fedb218 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -284,6 +284,7 @@ bool StandbyModeRequested = false;
char *PrimaryConnInfo = NULL;
char *PrimarySlotName = NULL;
char *PromoteTriggerFile = NULL;
+bool wal_receiver_create_temp_slot = false;
/* are we currently in standby mode? */
bool StandbyMode = false;
@@ -11951,7 +11952,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
}
curFileTLI = tli;
RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
- PrimarySlotName);
+ PrimarySlotName, wal_receiver_create_temp_slot);
receivedUpto = 0;
}
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 25e0333c9e..779d19f1c1 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -15,6 +15,12 @@
* WalRcv->receivedUpto variable in shared memory, to inform the startup
* process of how far it can proceed with XLOG replay.
*
+ * WAL receivers cannot load directly GUC parameters used when establishing
+ * their connection to the primary, and rely on parameter values passed down
+ * by the startup process when WAL streaming is requested. This applies
+ * to for example the replication slot creation and the connection string to
+ * use for the connection with the primary.
+ *
* If the primary server ends streaming, but doesn't disconnect, walreceiver
* goes into "waiting" mode, and waits for the startup process to give new
* instructions. The startup process will treat that the same as
@@ -74,7 +80,6 @@
/* GUC variables */
-bool wal_receiver_create_temp_slot;
int wal_receiver_status_interval;
int wal_receiver_timeout;
bool hot_standby_feedback;
@@ -349,42 +354,23 @@ WalReceiverMain(void)
WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
/*
- * Create temporary replication slot if no slot name is configured or
- * the slot from the previous run was temporary, unless
- * wal_receiver_create_temp_slot is disabled. We also need to handle
- * the case where the previous run used a temporary slot but
- * wal_receiver_create_temp_slot was changed in the meantime. In that
- * case, we delete the old slot name in shared memory. (This would
+ * Create temporary replication slot if requested. In that
+ * case, we update slot name in shared memory. (This would
* all be a bit easier if we just didn't copy the slot name into
* shared memory, since we won't need it again later, but then we
* can't see the slot name in the stats views.)
*/
- if (slotname[0] == '\0' || is_temp_slot)
+ if (is_temp_slot)
{
- bool changed = false;
+ snprintf(slotname, sizeof(slotname),
+ "pg_walreceiver_%lld",
+ (long long int) walrcv_get_backend_pid(wrconn));
- if (wal_receiver_create_temp_slot)
- {
- snprintf(slotname, sizeof(slotname),
- "pg_walreceiver_%lld",
- (long long int) walrcv_get_backend_pid(wrconn));
+ walrcv_create_slot(wrconn, slotname, true, 0, NULL);
- walrcv_create_slot(wrconn, slotname, true, 0, NULL);
- changed = true;
- }
- else if (slotname[0] != '\0')
- {
- slotname[0] = '\0';
- changed = true;
- }
-
- if (changed)
- {
- SpinLockAcquire(&walrcv->mutex);
- strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
- walrcv->is_temp_slot = wal_receiver_create_temp_slot;
- SpinLockRelease(&walrcv->mutex);
- }
+ SpinLockAcquire(&walrcv->mutex);
+ strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
+ SpinLockRelease(&walrcv->mutex);
}
/*
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 89c903e45a..21d1823607 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -215,13 +215,19 @@ ShutdownWalRcv(void)
/*
* Request postmaster to start walreceiver.
*
- * recptr indicates the position where streaming should begin, conninfo
- * is a libpq connection string to use, and slotname is, optionally, the name
- * of a replication slot to acquire.
+ * "recptr" indicates the position where streaming should begin. "conninfo"
+ * is a libpq connection string to use. "slotname" is, optionally, the name
+ * of a replication slot to acquire. "create_temp_slot" indicates to create
+ * a temporary slot when no "slotname" is given.
+ *
+ * WAL receivers do not directly load GUC parameters used for the connection
+ * to the primary, and rely on the values passed down by the caller of this
+ * routine instead. Hence, the addition of any new parameters should happen
+ * through this code path.
*/
void
RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
- const char *slotname)
+ const char *slotname, bool create_temp_slot)
{
WalRcvData *walrcv = WalRcv;
bool launch = false;
@@ -248,10 +254,22 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
else
walrcv->conninfo[0] = '\0';
- if (slotname != NULL)
+ /*
+ * Use configured replication slot if present, and ignore the value
+ * of create_temp_slot as the slot name should be persistent. Otherwise,
+ * use create_temp_slot to determine whether this WAL receiver should
+ * create a temporary slot by itself and use it, or not.
+ */
+ if (slotname != NULL && slotname[0] != '\0')
+ {
strlcpy((char *) walrcv->slotname, slotname, NAMEDATALEN);
+ walrcv->is_temp_slot = false;
+ }
else
+ {
walrcv->slotname[0] = '\0';
+ walrcv->is_temp_slot = create_temp_slot;
+ }
if (walrcv->walRcvState == WALRCV_STOPPED)
{
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 68082315ac..80b6801c82 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2030,11 +2030,11 @@ static struct config_bool ConfigureNamesBool[] =
},
{
- {"wal_receiver_create_temp_slot", PGC_SIGHUP, REPLICATION_STANDBY,
+ {"wal_receiver_create_temp_slot", PGC_POSTMASTER, REPLICATION_STANDBY,
gettext_noop("Sets whether a WAL receiver should create a temporary replication slot if no permanent slot is configured."),
},
&wal_receiver_create_temp_slot,
- true,
+ false,
NULL, NULL, NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index aa44f0c9bf..f2e55d1bd3 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -321,7 +321,9 @@
#max_standby_streaming_delay = 30s # max delay before canceling queries
# when reading streaming WAL;
# -1 allows indefinite delay
-#wal_receiver_create_temp_slot = on # create temp slot if primary_slot_name not set
+#wal_receiver_create_temp_slot = off # Create temp slot if primary_slot_name
+ # is not set.
+ # (change requires restart)
#wal_receiver_status_interval = 10s # send replies at least this often
# 0 disables
#hot_standby_feedback = off # send info from standby to prevent
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 98b033fc20..12362421d7 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -129,6 +129,7 @@ extern int recoveryTargetAction;
extern int recovery_min_apply_delay;
extern char *PrimaryConnInfo;
extern char *PrimarySlotName;
+extern bool wal_receiver_create_temp_slot;
/* indirectly set via GUC system */
extern TransactionId recoveryTargetXid;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e08afc6548..cf3e43128c 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -23,7 +23,6 @@
#include "utils/tuplestore.h"
/* user-settable parameters */
-extern bool wal_receiver_create_temp_slot;
extern int wal_receiver_status_interval;
extern int wal_receiver_timeout;
extern bool hot_standby_feedback;
@@ -321,7 +320,8 @@ extern void ShutdownWalRcv(void);
extern bool WalRcvStreaming(void);
extern bool WalRcvRunning(void);
extern void RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr,
- const char *conninfo, const char *slotname);
+ const char *conninfo, const char *slotname,
+ bool create_temp_slot);
extern XLogRecPtr GetWalRcvWriteRecPtr(XLogRecPtr *latestChunkStart, TimeLineID *receiveTLI);
extern int GetReplicationApplyDelay(void);
extern int GetReplicationTransferLatency(void);
On Tue, Mar 17, 2020 at 11:39:11PM +0300, Sergei Kornilov wrote:
We need this change to set is_temp_slot properly. PrimarySlotName
GUC can usually be an empty string, so just "slotname != NULL" is
not enough.
Yep, or a temporary slot would never be created even if there is no
slot defined, and the priority goes to primary_slot_name if set.
I attached patch with this change.
Thanks, I have added a new open item for v13 to track this effort:
https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items
--
Michael