Issue with logical replication slot during switchover
Hi,
An issue occurred during the initial switchover using PostgreSQL version
17.5. The setup consists of a cluster with two nodes, managed by Patroni
version 4.0.5.
Logical replication is configured on the same instance, and the new feature
enabling logical replication slots to be failover-safe in a highly
available environment is used. Logical slot management is currently
disabled in Patroni.
Following are some screen captures taken during the switchover:
1. Run the switchover with Patroni
patronictl switchover
Current cluster topology
+ Cluster: ClusterX (7529893278186104053) ----+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+----------+--------------+---------+-----------+----+-----------+
| node_1 | xxxxxxxxxxxx | Leader | running | 4 | |
| node_2 | xxxxxxxxxxxx | Replica | streaming | 4 | 0 |
+----------+--------------+---------+-----------+----+-----------+
2. Check the slot on the new Primary
select * from pg_replication_slots where slot_type = 'logical';
+-[ RECORD 1 ]--------+----------------+
| slot_name | logical_slot |
| plugin | pgoutput |
| slot_type | logical |
| datoid | 25605 |
| database | db_test |
| temporary | f |
| active | t |
| active_pid | 3841546 |
| xmin | |
| catalog_xmin | 10399 |
| restart_lsn | 0/37002410 |
| confirmed_flush_lsn | 0/37002448 |
| wal_status | reserved |
| safe_wal_size | |
| two_phase | f |
| inactive_since | |
| conflicting | f |
| invalidation_reason | |
| failover | t |
| synced | t |
+---------------------+----------------+
Logical replication is active again after the promotion.
3. Check the slot on the new standby
select * from pg_replication_slots where slot_type = 'logical';
+-[ RECORD 1 ]--------+-------------------------------+
| slot_name | logical_slot |
| plugin | pgoutput |
| slot_type | logical |
| datoid | 25605 |
| database | db_test |
| temporary | f |
| active | f |
| active_pid | |
| xmin | |
| catalog_xmin | 10397 |
| restart_lsn | 0/3638F5F0 |
| confirmed_flush_lsn | 0/3638F6A0 |
| wal_status | reserved |
| safe_wal_size | |
| two_phase | f |
| inactive_since | 2025-08-05 10:21:03.342587+02 |
| conflicting | f |
| invalidation_reason | |
| failover | t |
| synced | f |
+---------------------+-------------------------------+
The synced flag keeps the value false.
The following error appears in the log:
2025-06-10 16:40:58.996 CEST [739829]: [1-1] user=,db=,client=,application=
LOG: slot sync worker started
2025-06-10 16:40:59.011 CEST [739829]: [2-1] user=,db=,client=,application=
ERROR: exiting from slot synchronization because same name slot
"logical_slot" already exists on the standby
I would like to make a proposal to address the issue:
Since the logical slot is in a failover state on both the primary and the
standby, an attempt could be made to resynchronize them.
I modified the slotsync.c module:
+++ b/src/backend/replication/logical/slotsync.c
@@ -649,24 +649,46 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 		return false;
 	}
-
-	/* Search for the named slot */
+	// Both local and remote slot have the same name
 	if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
 	{
 		bool	synced;
+		bool	failover_status = remote_slot->failover;

 		SpinLockAcquire(&slot->mutex);
 		synced = slot->data.synced;
 		SpinLockRelease(&slot->mutex);
+
+		if (!synced){
+
+			Assert(!MyReplicationSlot);
+
+			if (failover_status){
+
+				ReplicationSlotAcquire(remote_slot->name, true, true);
+
+				// Put the synced flag to true to attempt resynchronizing failover slot on the standby
+				MyReplicationSlot->data.synced = true;
+
+				ReplicationSlotMarkDirty();
-		/* User-created slot with the same name exists, raise ERROR. */
-		if (!synced)
-			ereport(ERROR,
+				ReplicationSlotRelease();
+
+				/* Get rid of a replication slot that is no longer wanted */
+				ereport(WARNING,
+						errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						errmsg("slot \"%s\" local slot has the same name as remote slot and they are in failover mode, try to synchronize them",
+							   remote_slot->name));
+				return false;	/* Going back to the main loop after droping the failover slot */
+			}
+			else
+				/* User-created slot with the same name exists, raise ERROR. */
+				ereport(ERROR,
						errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
						errmsg("exiting from slot synchronization because same"
-							   " name slot \"%s\" already exists on the standby",
-							   remote_slot->name));
-
+							   " name slot \"%s\" already exists on the standby",
+							   remote_slot->name));
+		}

 		/*
 		 * The slot has been synchronized before.
 		 *
This message follows the discussions started in this thread:
https://www.postgresql.org/message-id/CAA5-nLDvnqGtBsKu4T_s-cS%2BdGbpSLEzRwgep1XfYzGhQ4o65A%40mail.gmail.com
Help would be appreciated to move this point forward.
Best regards,
Fabrice
Thank You for starting a new thread and working on it.
I think you have given reference to the wrong thread; the correct one
is [1].
IIUC, the proposed fix checks whether remote_slot is failover-enabled
and, if a slot with the same name exists locally but has 'synced'=false,
enables the 'synced' flag and proceeds with synchronization from the next
cycle onward; otherwise it errors out. But the remote slot's failover will
always be true if we have reached this stage. Did you actually mean to check
if the local slot is failover-enabled but has 'synced' set to false
(indicating it’s a new standby after a switchover)? Even with that
check, it might not be the correct thing to overwrite such a slot
internally. I think in [1] we discussed a couple of ideas related to
a GUC, an alter API, and a drop_all_slots API, but I don't see any of those
proposed here. Do we want to try any of that?
[1]: https://www.postgresql.org/message-id/CAA5-nLD0vKn6T1-OHROBNfN2Pxa17zVo4UoVBdfHn2y=7nKixA@mail.gmail.com
thanks
Shveta
Thanks Shveta for coming on this point again and fixing the link.
The idea is to check if the slot has the same name and try to resynchronize it
with the primary.
OK, the check on the failover status for the remote slot is perhaps
redundant.
I'm not sure what impact setting the synced flag to true might have. But if
you run an additional switchover, it works fine because the synced flag on
the new primary is set to true now.
If we come back to the idea of the GUC or the API, adding an
allow_overwrite parameter to the pg_create_logical_replication_slot
function and removing the logical slot when set to true could be a suitable
approach.
What is your opinion?
Regards,
Fabrice
If implemented as a GUC, it would address only a specific corner case,
making it less suitable to be added as a GUC.
OTOH, adding it as a slot's property makes more sense. You can start
with introducing a new slot property, allow_overwrite. By default,
this property will be set to false.
a) The function pg_create_logical_replication_slot() can be extended
to accept this parameter.
b) A new API pg_alter_logical_replication_slot() can be introduced, to
modify this property after slot creation if needed.
c) The commands CREATE SUBSCRIPTION and ALTER SUBSCRIPTION are not
needed to include an allow_overwrite parameter. When CREATE
SUBSCRIPTION creates a slot, it will always set allow_overwrite to
false by default. If users need to change this later, they can use the
new API pg_alter_logical_replication_slot() to update the property.
d) Additionally, pg_alter_logical_replication_slot() can serve as a
generic API to modify other slot properties as well.
This appears to be a reasonable idea with potential use cases beyond
just allowing synchronization post switchover. Thoughts?
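To make this concrete, a rough sketch of what the SQL level could look like
(purely illustrative; neither the allow_overwrite argument nor
pg_alter_logical_replication_slot() exists yet):

-- create a failover slot that the slot sync worker may overwrite on a standby
select pg_create_logical_replication_slot('logical_slot', 'pgoutput',
       temporary => false, twophase => false,
       failover => true, allow_overwrite => true);

-- change the property later on an existing, unused slot (proposed API)
select pg_alter_logical_replication_slot('logical_slot', allow_overwrite => true);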
~~~
Another problem as you pointed out is inconsistent behaviour across
switchovers. On the first switchover, we get the error on new standby:
"Exiting from slot synchronization because a slot with the same name
already exists on the standby."
But in the case of a double switchover, this error does not occur.
This is due to the 'synced' flag not set on new standby on first
switchover while set in double switchover. I think the behaviour
should be the same. In both cases, it should emit the same error. We
are thinking of a potential solution here and will start a new thread
if needed.
thanks
Shveta
Thanks for sharing this. In fact, I agree that introducing an allow_overwrite
slot property seems cleaner than a GUC for this specific use case.
a) At first, an extension of pg_create_logical_replication_slot() could be
proposed.
b) pg_alter_logical_replication_slot(): that could be a neat approach; a
generic API for modifying other slot properties seems like a
forward-looking design choice. However, I don't know if slot properties
can be modified easily, because when a slot is active, a process is holding
them in memory.
c) CREATE SUBSCRIPTION and ALTER SUBSCRIPTION stay untouched
Regarding the switchover inconsistency, I agree that behavior should be
uniform. The current discrepancy between the first and double switchover is
confusing, and having the same error in both cases would improve
predictability. It's up to you if you want to open a thread for this.
Regards,
Fabrice
Right, it cannot be. This API can be used when the slot is not
acquired by any other process. We may have use cases for that as well,
similar to the one we have for this switchover case.
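For example, whether a slot is currently acquired can be checked from the
view before calling such an API (sketch):

select slot_name, active, active_pid
from pg_replication_slots
where slot_name = 'logical_slot';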
thanks
Shveta
I'll work on the pg_create_logical_replication_slot() API. The
pg_replication_slots view must also be adjusted to display the new
allow_overwrite flag.
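For example, something like this should then show the new property (sketch;
the allow_overwrite column does not exist yet):

select slot_name, failover, synced, allow_overwrite
from pg_replication_slots
where slot_type = 'logical';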
Regards
Fabrice
For the first step (a), the pg_create_logical_replication_slot interface is
extended.
The slot on the new attached standby will be dropped and recreated if the
flag allow_overwrite is set to true.
I tested the modified source; could you please give me feedback on the code
changes?
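For reference, a usage sketch with the extended signature (positional
arguments per this patch: slot_name, plugin, temporary, twophase, failover,
allow_overwrite):

-- on the primary: create the failover slot with the new flag enabled
select pg_create_logical_replication_slot('logical_slot', 'pgoutput',
                                          false, false, true, true);
-- after a switchover, the leftover slot on the new standby is dropped by
-- the slot sync worker and then recreated as a synced slot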
Regards,
Fabrice
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308..6cd3175 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
     IN temporary boolean DEFAULT false,
     IN twophase boolean DEFAULT false,
     IN failover boolean DEFAULT false,
+    IN allow_overwrite boolean DEFAULT false,
     OUT slot_name name, OUT lsn pg_lsn)
 RETURNS RECORD
 LANGUAGE INTERNAL
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 656e66e..d6332cd 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -627,6 +627,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 	ReplicationSlot *slot;
 	XLogRecPtr	latestFlushPtr;
 	bool		slot_updated = false;
+	bool		allow_overwrite = false;

 	/*
 	 * Make sure that concerned WAL is received and flushed before syncing
@@ -649,24 +650,46 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 		return false;
 	}
-
-	/* Search for the named slot */
+	// Both local and remote slot have the same name
 	if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
 	{
-		bool	synced;
+		bool	synced;

 		SpinLockAcquire(&slot->mutex);
 		synced = slot->data.synced;
+		allow_overwrite = slot->data.allow_overwrite;
 		SpinLockRelease(&slot->mutex);
-
-		/* User-created slot with the same name exists, raise ERROR. */
-		if (!synced)
-			ereport(ERROR,
+
+		if (!synced){
+			/*
+			 * Check if we need to overwrite an existing
+			 * logical slot
+			 */
+			if (allow_overwrite){
+				/*
+				 * Get rid of a replication slot that is no
+				 *longer wanted
+				 */
+				ReplicationSlotDrop(remote_slot->name,true);
+
+				/* Get rid of a replication slot that is no longer wanted */
+				ereport(WARNING,
+						errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						errmsg("slot \"%s\" already exists"
+							   " on the standby but it will be dropped because flag allow_overwrite is set to true",
+							   remote_slot->name));
+
+				/* Going back to the main loop after droping the failover slot */
+				return false;
+			}
+			else
+				/* User-created slot with the same name exists, raise ERROR. */
+				ereport(ERROR,
						errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
						errmsg("exiting from slot synchronization because same"
-							   " name slot \"%s\" already exists on the standby",
-							   remote_slot->name));
-
+							   " name slot \"%s\" already exists on the standby",
+							   remote_slot->name));
+		}
 		/*
 		 * The slot has been synchronized before.
 		 *
@@ -761,6 +784,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 		ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
 							  remote_slot->two_phase,
 							  remote_slot->failover,
+							  allow_overwrite,
 							  true);

 		/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 600b87f..d6bc5c6 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -323,7 +323,7 @@ ReplicationSlotValidateName(const char *name, int elevel)
 void
 ReplicationSlotCreate(const char *name, bool db_specific,
 					  ReplicationSlotPersistency persistency,
-					  bool two_phase, bool failover, bool synced)
+					  bool two_phase, bool failover, bool allow_overwrite, bool synced)
 {
 	ReplicationSlot *slot = NULL;
 	int			i;
@@ -413,6 +413,11 @@ ReplicationSlotCreate(const char *name, bool db_specific,
 	slot->data.two_phase_at = InvalidXLogRecPtr;
 	slot->data.failover = failover;
 	slot->data.synced = synced;
+	slot->data.allow_overwrite = allow_overwrite;
+
+	elog(LOG, "Logical replication slot %s created with option allow_overwrite to %s",
+		 NameStr(slot->data.name),
+		 slot->data.allow_overwrite ? "true" : "false");

 	/* and then data only present in shared memory */
 	slot->just_dirtied = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 36cc2ed..6bd430f 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -40,7 +40,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
 	/* acquire replication slot, this will check for conflicting names */
 	ReplicationSlotCreate(name, false,
-						  temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
+						  temporary ? RS_TEMPORARY : RS_PERSISTENT, false, false,
 						  false, false);

 	if (immediately_reserve)
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
 static void
 create_logical_replication_slot(char *name, char *plugin,
 								bool temporary, bool two_phase,
-								bool failover,
+								bool failover, bool allow_overwrite,
 								XLogRecPtr restart_lsn,
 								bool find_startpoint)
 {
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
 	 */
 	ReplicationSlotCreate(name, true,
 						  temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
-						  failover, false);
+						  failover, allow_overwrite, false);

 	/*
 	 * Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
 	bool		temporary = PG_GETARG_BOOL(2);
 	bool		two_phase = PG_GETARG_BOOL(3);
 	bool		failover = PG_GETARG_BOOL(4);
+	bool		allow_overwrite = PG_GETARG_BOOL(5);
 	Datum		result;
 	TupleDesc	tupdesc;
 	HeapTuple	tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
 									temporary,
 									two_phase,
 									failover,
+									allow_overwrite,
 									InvalidXLogRecPtr,
 									true);
@@ -210,6 +212,47 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
 	PG_RETURN_DATUM(result);
 }

+/*
+ * This function is intended to modify a logical replication slot with
+ * given arguments.
+ */
+static void
+alter_logical_replication_slot(char *name, bool two_phase,
+							   bool failover,
+							   bool allow_overwrite)
+{
+	Assert(!MyReplicationSlot);
+
+	ReplicationSlotAcquire(name, true, true);
+	MyReplicationSlot->data.allow_overwrite = allow_overwrite;
+	ReplicationSlotMarkDirty();
+
+	ReplicationSlotRelease();
+}
+
+/*
+ * SQL function for altering logical replication slot properties.
+ */
+Datum
+pg_alter_logical_replication_slot(PG_FUNCTION_ARGS)
+{
+	Name		name = PG_GETARG_NAME(0);
+	bool		two_phase = PG_GETARG_BOOL(1);
+	bool		failover = PG_GETARG_BOOL(2);
+	bool		allow_overwrite = PG_GETARG_BOOL(3);
+
+	CheckSlotPermissions();
+
+	CheckLogicalDecodingRequirements();
+
+	alter_logical_replication_slot(NameStr(*name),
+								   two_phase,
+								   failover,
+								   allow_overwrite);
+
+	PG_RETURN_NAME(name);
+}
+
 /*
  * SQL function for dropping a replication slot.
@@ -726,6 +769,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
 									temporary,
 									false,
 									false,
+									false,
 									src_restart_lsn,
 									false);
 	}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 9fa8beb..ef22695 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1198,7 +1198,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 	{
 		ReplicationSlotCreate(cmd->slotname, false,
 							  cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
-							  false, false, false);
+							  false, false, false, false);

 		if (reserve_wal)
 		{
@@ -1229,7 +1229,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 	 */
 	ReplicationSlotCreate(cmd->slotname, true,
 						  cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
-						  two_phase, failover, false);
+						  two_phase, failover, false, false);

 	/*
 	 * Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 62beb71..074805d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11480,10 +11480,10 @@
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
   proparallel => 'u', prorettype => 'record',
-  proargtypes => 'name name bool bool bool',
-  proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
-  proargmodes => '{i,i,i,i,i,o,o}',
-  proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+  proargtypes => 'name name bool bool bool bool',
+  proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+  proargmodes => '{i,i,i,i,i,i,o,o}',
+  proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
   prosrc => 'pg_create_logical_replication_slot' },
 { oid => '4222',
   descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index eb0b93b..1fd6445 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -134,6 +134,11 @@ typedef struct ReplicationSlotPersistentData
 	 * for logical slots on the primary server.
 	 */
 	bool		failover;
+	/*
+	 * Allow Postgres to drop logical replication slot on standby server to ensure
+	 * creation of new failover slot when sync_replication_slots is true.
+	 */
+	bool		allow_overwrite;
 } ReplicationSlotPersistentData;

 /*
@@ -267,7 +272,7 @@ extern void ReplicationSlotsShmemInit(void);
 /* management of individual slots */
 extern void ReplicationSlotCreate(const char *name, bool db_specific,
 								  ReplicationSlotPersistency persistency,
-								  bool two_phase, bool failover,
+								  bool two_phase, bool failover, bool allow_overwrite,
 								  bool synced);
 extern void ReplicationSlotPersist(void);
 extern void ReplicationSlotDrop(const char *name, bool nowait);
Thanks for working on this. Can you please provide a patch instead of a
diff? It will be easier for reviewers to apply and test it then.
thanks
Shveta
What is the procedure to create this patch? Thank you for your help.
We use 'git format-patch' to generate formatted patches. I have given
a few links ([1], [2], [3]) on know-how.
I usually use these steps:
git add <file1.c> <adds file to staging area>
git add <file2.c>
<once all the required files are added, create a commit>
git commit <add the required message to describe the patch>
git format-patch -1 -v1 <create the patch>
-1-> to create patch for last 1 commit
-v1 --> patch-prefix. I have used v1 which indicates version-1, when
we upload the patch a second time, we can use -v2 and so on.
[1]: https://stackoverflow.com/questions/6658313/how-can-i-generate-a-git-patch-for-a-specific-commit
[2]: https://www.geeksforgeeks.org/git/how-to-generate-a-git-patch-for-a-specific-commit/
[3]: https://gist.github.com/ganapathichidambaram/3504e31fc8ba809762b3a0d9d9ef8ec2
thanks
Shveta
Thanks Shveta for the procedure to follow; here is version 1 of the
patch.
Regards,
Fabrice
Attachments:
v1-0001-fix-failover-slot-issue-when-doing-a-switchover.patch
From 58403718a39b81aa107e29bd1791e158f19e5646 Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice.chapuis@zas.admin.ch>
Date: Mon, 28 Jul 2025 15:27:03 +0200
Subject: [PATCH v1] fix failover slot issue when doing a switchover
---
src/backend/catalog/system_functions.sql | 1 +
src/backend/replication/logical/slotsync.c | 44 +++++++++++++++++-----
src/backend/replication/slot.c | 7 +++-
src/backend/replication/slotfuncs.c | 10 +++--
src/backend/replication/walsender.c | 4 +-
src/include/catalog/pg_proc.dat | 8 ++--
src/include/replication/slot.h | 7 +++-
7 files changed, 59 insertions(+), 22 deletions(-)
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308..6cd3175 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN allow_overwrite boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 656e66e..d6332cd 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -627,6 +627,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlot *slot;
XLogRecPtr latestFlushPtr;
bool slot_updated = false;
+ bool allow_overwrite = false;
/*
* Make sure that concerned WAL is received and flushed before syncing
@@ -649,24 +650,46 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
return false;
}
-
- /* Search for the named slot */
+ // Both local and remote slot have the same name
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
- bool synced;
+ bool synced;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ allow_overwrite = slot->data.allow_overwrite;
SpinLockRelease(&slot->mutex);
-
- /* User-created slot with the same name exists, raise ERROR. */
- if (!synced)
- ereport(ERROR,
+
+ if (!synced){
+ /*
+ * Check if we need to overwrite an existing
+ * logical slot
+ */
+ if (allow_overwrite){
+ /*
+ * Get rid of a replication slot that is no
+ *longer wanted
+ */
+ ReplicationSlotDrop(remote_slot->name,true);
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but it will be dropped because flag allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Going back to the main loop after droping the failover slot */
+ return false;
+ }
+ else
+ /* User-created slot with the same name exists, raise ERROR. */
+ ereport(ERROR,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("exiting from slot synchronization because same"
- " name slot \"%s\" already exists on the standby",
- remote_slot->name));
-
+ " name slot \"%s\" already exists on the standby",
+ remote_slot->name));
+ }
/*
* The slot has been synchronized before.
*
@@ -761,6 +784,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
+ allow_overwrite,
true);
/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 600b87f..d6bc5c6 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -323,7 +323,7 @@ ReplicationSlotValidateName(const char *name, int elevel)
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool allow_overwrite, bool synced)
{
ReplicationSlot *slot = NULL;
int i;
@@ -413,6 +413,11 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.allow_overwrite = allow_overwrite;
+
+ elog(LOG, "Logical replication slot %s created with option allow_overwrite to %s",
+ NameStr(slot->data.name),
+ slot->data.allow_overwrite ? "true" : "false");
/* and then data only present in shared memory */
slot->just_dirtied = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 36cc2ed..5dd5f50 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -40,7 +40,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
- temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
+ temporary ? RS_TEMPORARY : RS_PERSISTENT, false, false,
false, false);
if (immediately_reserve)
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, bool allow_overwrite,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, allow_overwrite, false);
/*
* Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ bool allow_overwrite = PG_GETARG_BOOL(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
temporary,
two_phase,
failover,
+ allow_overwrite,
InvalidXLogRecPtr,
true);
@@ -210,7 +212,6 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(result);
}
-
/*
* SQL function for dropping a replication slot.
*/
@@ -726,6 +727,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ false,
src_restart_lsn,
false);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 9fa8beb..ef22695 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1198,7 +1198,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, false);
if (reserve_wal)
{
@@ -1229,7 +1229,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, false);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 62beb71..074805d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11480,10 +11480,10 @@
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool bool',
+ proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index eb0b93b..1fd6445 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -134,6 +134,11 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+ /*
+ * Allow Postgres to drop logical replication slot on standby server to ensure
+ * creation of new failover slot when sync_replication_slots is true.
+ */
+ bool allow_overwrite;
} ReplicationSlotPersistentData;
/*
@@ -267,7 +272,7 @@ extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover,
+ bool two_phase, bool failover, bool allow_overwrite,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
--
2.47.3
Hi Fabrice,
Thanks for providing the patch. I reviewed your patch and have the
following comments:
1. I think we should add a commit message in the patch. It would help
to give an understanding of the patch.
2. I tried applying patch on HEAD, it generates following warnings:
Applying: fix failover slot issue when doing a switchover
.git/rebase-apply/patch:31: trailing whitespace.
bool allow_overwrite = false;
.git/rebase-apply/patch:45: trailing whitespace.
bool synced;
.git/rebase-apply/patch:55: trailing whitespace.
.git/rebase-apply/patch:56: trailing whitespace.
if (!synced){
.git/rebase-apply/patch:57: trailing whitespace.
/*
warning: squelched 16 whitespace errors
warning: 21 lines add whitespace errors.
3. I also tried to build the patch on HEAD and it is giving the following error:
ubuntu@ubuntu-Virtual-Machine:~/Project/pg/postgres$
./../../install-clean.sh pg_30_8 > warn.log
launcher.c: In function ‘CreateConflictDetectionSlot’:
launcher.c:1457:9: error: too few arguments to function ‘ReplicationSlotCreate’
1457 | ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false,
RS_PERSISTENT, false,
| ^~~~~~~~~~~~~~~~~~~~~
In file included from launcher.c:35:
../../../../src/include/replication/slot.h:307:13: note: declared here
307 | extern void ReplicationSlotCreate(const char *name, bool db_specific,
| ^~~~~~~~~~~~~~~~~~~~~
make[4]: *** [<builtin>: launcher.o] Error 1
make[3]: *** [../../../src/backend/common.mk:37: logical-recursive] Error 2
make[2]: *** [common.mk:37: replication-recursive] Error 2
make[1]: *** [Makefile:42: install-backend-recurse] Error 2
make: *** [GNUmakefile:11: install-src-recurse] Error 2
4. I have some general comments regarding formatting of the patch.
+ // Both local and remote slot have the same name
We use following format for single line comments:
/* comment text */
and for multi line comments we use following format:
/*
* comment text begins here
* and continues here
*/
5. We should use proper indentation here:
+ elog(LOG, "Logical replication slot %s created with option
allow_overwrite to %s",
+ NameStr(slot->data.name),
+ slot->data.allow_overwrite ? "true" : "false");
6.
+ if (!synced){
+ /*
+ * Check if we need to overwrite an existing
+ * logical slot
+ */
We should start the parentheses from the next line.
Also, indentation for comment is not correct here.
7.
+ if (allow_overwrite){
+ /*
+ * Get rid of a replication slot that is no
+ *longer wanted
+ */
Similar comment as above.
Please refer to [1] [2] for proper formatting of the patch.
[1]: https://www.postgresql.org/docs/devel/source-format.html
[2]: https://wiki.postgresql.org/wiki/Creating_Clean_Patches
Thanks,
Shlok Kyal
On Sat, Aug 30, 2025 at 11:43 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> 2. I tried applying patch on HEAD, it generates following warnings:
> [...]

'git diff --check' can be used before 'git commit' to get info of
the above issues before creating a patch.

> 5. We should use proper indentation here:
> [...]

src/tools/pgindent <file name>
can be used to do indentation before creating a patch.
Hi,
I will make the modifications based on the remarks you mentioned.
Regards,
Fabrice
Hi,
Here is version 1 of the patch with the modifications.
I ran git diff --check until there were no more errors.
In a next version of the patch, I think I will need to change the
ReplicationSlotCreate() calls so that the value of the allow_overwrite
flag is consistent, and modify the ReplicationSlotCreate interface to
expose the allow_overwrite attribute.
Best regards,
Fabrice
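For readers following along, a minimal usage sketch of the interface the
attached patch proposes. The sixth argument, allow_overwrite, is added by the
patch with DEFAULT false and does not exist in released PostgreSQL; the slot
and plugin names are illustrative:

-- Existing calls keep working because allow_overwrite defaults to false.
SELECT pg_create_logical_replication_slot('logical_slot', 'pgoutput');

-- Opting in to the proposed overwrite behaviour for a slot pre-created on
-- the standby (arguments: temporary, twophase, failover, allow_overwrite).
SELECT pg_create_logical_replication_slot('logical_slot', 'pgoutput',
                                          false, false, false, true);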
Attachment: v1-0001-fix-failover-slot-issue-when-doing-a-switchover.patch (application/octet-stream)
From e12031e0bdf098811d2fc99962a04c01a0ee77f9 Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice.chapuis@zas.admin.ch>
Date: Sun, 7 Sep 2025 09:37:24 +0200
Subject: [PATCH v1] Fix-failover-slot-issue-when-doing-a-switchover.
Add allow_overwrite parameter to pg_create_logical_replication_slot.
If allow_overwrite is set to true then existing failover slot on standby
will be dropped, then ensure main loop continues after drop and create a
new slot.
---
src/backend/catalog/system_functions.sql | 1 +
src/backend/replication/logical/launcher.c | 2 +-
src/backend/replication/logical/slotsync.c | 25 +++++++++++++++++++++-
src/backend/replication/slot.c | 6 +++++-
src/backend/replication/slotfuncs.c | 9 +++++---
src/backend/replication/walsender.c | 4 ++--
src/include/catalog/pg_proc.dat | 8 +++----
src/include/replication/slot.h | 8 ++++++-
8 files changed, 50 insertions(+), 13 deletions(-)
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308..6cd3175 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN allow_overwrite boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 37377f7..840741e 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -1455,7 +1455,7 @@ CreateConflictDetectionSlot(void)
errmsg("creating replication conflict detection slot"));
ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false, RS_PERSISTENT, false,
- false, false);
+ false, false,false);
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index caa003c..2a8692e 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -626,6 +626,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlot *slot;
XLogRecPtr latestFlushPtr;
bool slot_updated = false;
+ bool allow_overwrite = false;
/*
* Make sure that concerned WAL is received and flushed before syncing
@@ -656,15 +657,36 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ allow_overwrite = slot->data.allow_overwrite;
SpinLockRelease(&slot->mutex);
/* User-created slot with the same name exists, raise ERROR. */
if (!synced)
- ereport(ERROR,
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotDrop(remote_slot->name,true);
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but it will be dropped because "
+ "flag allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Going back to the main loop after droping the failover slot */
+ return false;
+ }
+ else
+ ereport(ERROR,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("exiting from slot synchronization because same"
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ }
/*
* The slot has been synchronized before.
@@ -760,6 +782,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
+ allow_overwrite,
true);
/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb9..9a85fb0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -351,7 +351,7 @@ IsSlotForConflictCheck(const char *name)
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool allow_overwrite, bool synced)
{
ReplicationSlot *slot = NULL;
int i;
@@ -445,6 +445,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.allow_overwrite = allow_overwrite;
slot->data.synced = synced;
/* and then data only present in shared memory */
@@ -500,6 +501,9 @@ ReplicationSlotCreate(const char *name, bool db_specific,
*/
LWLockRelease(ReplicationSlotAllocationLock);
+ elog(DEBUG1, "logical replication slot %s created with option allow_overwrite to %s",
+ NameStr(slot->data.name), slot->data.allow_overwrite ? "true" : "false");
+
/* Let everybody know we've modified this slot */
ConditionVariableBroadcast(&slot->active_cv);
}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 69f4c61..3f53502 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -41,7 +41,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false,false);
if (immediately_reserve)
{
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, bool allow_overwrite,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, allow_overwrite, false);
/*
* Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ bool allow_overwrite = PG_GETARG_BOOL(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
temporary,
two_phase,
failover,
+ allow_overwrite,
InvalidXLogRecPtr,
true);
@@ -726,6 +728,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ false,
src_restart_lsn,
false);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e3dce9d..a2db1e3 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1201,7 +1201,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, false);
if (reserve_wal)
{
@@ -1232,7 +1232,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, false);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 118d6da..ee55572 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11498,10 +11498,10 @@
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool bool',
+ proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index e8fc342..eef5c86 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -141,6 +141,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /*
+ * Allow Postgres to drop logical replication slot on standby server to ensure
+ * creation of new failover slot when remote sync_replication_slots is set to true.
+ */
+ bool allow_overwrite;
} ReplicationSlotPersistentData;
/*
@@ -301,7 +307,7 @@ extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover,
+ bool two_phase, bool failover, bool allow_overwrite,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
--
2.47.3
On Wed, Aug 13, 2025 at 11:34 AM shveta malik <shveta.malik@gmail.com> wrote:
On Fri, Aug 8, 2025 at 7:01 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Another problem, as you pointed out, is inconsistent behaviour across
switchovers. On the first switchover, we get the error on the new standby:
"Exiting from slot synchronization because a slot with the same name
already exists on the standby."
But in the case of a double switchover, this error does not occur.
This is because the 'synced' flag is not set on the new standby after the
first switchover, while it is set after a double switchover. I think the
behaviour should be the same: in both cases, it should emit the same error. We
are thinking of a potential solution here and will start a new thread
if needed.
Started a thread [1] for this issue.
[1]: /messages/by-id/CAJpy0uBJqTQdq+UDW55oK2bhxMig=FaEYFV=+ZaKmtmgNWwsHw@mail.gmail.com
thanks
Shveta
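A quick way to observe the difference described above is to check the synced
flag on the new standby after each switchover; both columns queried here
already exist in pg_replication_slots:

-- On the new standby, after the switchover
SELECT slot_name, failover, synced
FROM pg_replication_slots
WHERE slot_type = 'logical';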
On Sun, Sep 7, 2025 at 1:32 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Hi,
Here is version 1 of the patch with the modifications.
I ran git diff --check until there were no more errors.
In a next version of the patch, I think I will need to change the ReplicationSlotCreate() calls so that the value of the allow_overwrite flag is consistent
In which flow? I see all ReplicationSlotCreate() references already
accepting the value.
and modify the ReplicationSlotCreate interface to expose the allow_overwrite attribute.
Do you mean to modify pg_replication_slots?
Thank You for the patch. Please find a few comments:
1)
In both create_physical_replication_slot() and
CreateConflictDetectionSlot():
+ false, false,false);
,false --> , false (space after comma is recommended)
2)
+ elog(DEBUG1, "logical replication slot %s created with option
allow_overwrite to %s",
+ NameStr(slot->data.name), slot->data.allow_overwrite ? "true" : "false");
IMO, we don't need this as we do not log other properties of the slot as well.
3)
We can have pg_replication_slots (pg_get_replication_slots) modified
to display the new property. Also we can modify docs to have the new
property defined.
4)
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotDrop(remote_slot->name,true);
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but it will be dropped because "
+ "flag allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Going back to the main loop after droping the failover slot */
+ return false;
Currently we are returning after dropping the slot. But I think we
shall drop the slot and proceed with new slot creation of the same
name; otherwise we may be left with the old slot dropped and no new slot
created, especially when the API is run.
Scenario:
--On primary, create 2 slots: slot1 and slot2.
--On standby, create one slot slot1 with allow_overwrite=true.
--Run 'SELECT pg_sync_replication_slots();' on standby.
At the end of the API call, the expectation is that both slots will be present
with 'synced'=true, but only slot2 is present. If we run this API next
time, slot1 is created. It should have been dropped and recreated (or
say overwritten) in the first run itself.
5)
IMO, LOG is sufficient here, as the action aligns with the
user-provided 'allow_overwrite' setting.
thanks
Shveta
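To make the scenario in comment 4 concrete, here is a minimal SQL sketch. The
sixth argument to pg_create_logical_replication_slot is the allow_overwrite
flag proposed by the patch and is not part of released PostgreSQL; the slot
names follow the scenario:

-- On the primary: two failover-enabled slots.
SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false, true);
SELECT pg_create_logical_replication_slot('slot2', 'pgoutput', false, false, true);

-- On the standby: a pre-existing slot1 that opts in to being overwritten.
SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false, false, true);

-- On the standby: one sync cycle; per the comment above, both slot1 and
-- slot2 should come out of this single call with synced = true.
SELECT pg_sync_replication_slots();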
Thanks Shveta for all your comments. I'll prepare a version 2 of the patch
including your proposals.
Regards
Fabrice
Hi Shveta,
Here is the version 1 of the patch with corrections
1)
In both create_physical_replication_slot() and
CreateConflictDetectionSlot():
+ false, false,false);
,false --> , false (space after comma is recommended)
2)
+ elog(DEBUG1, "logical replication slot %s created with option
allow_overwrite to %s",
+ NameStr(slot->data.name), slot->data.allow_overwrite ? "true" : "false");
IMO, we don't need this as we do not log other properties of the slot as
well.
Point 1-2 done
3)
We can have pg_replication_slots (pg_get_replication_slots) modified
to display the new property. Also we can modify docs to have the new
property defined.
Point 3 Not yet implemented
4)
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotDrop(remote_slot->name,true);
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but it will be dropped because "
+ "flag allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Going back to the main loop after droping the failover slot */
+ return false;
Currently we are returning after dropping the slot. But I think we
shall drop the slot and proceed with new slot creation of the same
name; otherwise we may be left with the old slot dropped and no new slot
created, especially when the API is run.
Scenario:
--On primary, create 2 slots: slot1 and slot2.
--On standby, create one slot slot1 with allow_overwrite=true.
--Run 'SELECT pg_sync_replication_slots();' on standby.
At the end of the API call, the expectation is that both slots will be present
with 'synced'=true, but only slot2 is present. If we run this API next
time, slot1 is created. It should have been dropped and recreated (or
say overwritten) in the first run itself.
Point 4:
I put the slot creation under the allow_overwrite condition.
After testing with syn_standby_slots disabled on the standby, it works. I
think the code for the synchronisation of the new slot could be factorised.
5)
IMO, LOG is sufficient here, as the action aligns with the
user-provided 'allow_overwrite' setting.
Point 5
Where is it in the code?
Best regards,
Fabrice
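Regarding point 3, if pg_get_replication_slots() is eventually extended as
suggested, the new property would presumably become queryable along these
lines; note that the allow_overwrite column is hypothetical and exists
neither in released PostgreSQL nor in the attached patch:

-- Hypothetical once point 3 is implemented.
SELECT slot_name, failover, synced, allow_overwrite
FROM pg_replication_slots
WHERE slot_type = 'logical';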
Attachment: v1-0001-Add-allow_overwrite-option-to-logical-replication.patch (application/octet-stream)
From 5b13c0061fd354c9e77991ab401a1788464e7b64 Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice636861@gmail.com>
Date: Fri, 26 Sep 2025 17:52:39 +0200
Subject: [PATCH v1] Add allow_overwrite option to logical replication slot
creation
This patch adds a new parameter allow_overwrite to
pg_create_logical_replication_slot. When true, an existing
failover slot on standby will be dropped and replaced
automatically.
Signed-off-by: Fabrice Chapuis <fabrice636861@gmail.com>
---
src/backend/catalog/system_functions.sql | 1 +
src/backend/replication/logical/launcher.c | 2 +-
src/backend/replication/logical/slotsync.c | 205 ++++++++++++++-------
src/backend/replication/slot.c | 3 +-
src/backend/replication/slotfuncs.c | 9 +-
src/backend/replication/walsender.c | 4 +-
src/include/catalog/pg_proc.dat | 8 +-
src/include/replication/slot.h | 8 +-
8 files changed, 160 insertions(+), 80 deletions(-)
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308..6cd3175 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN allow_overwrite boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 37377f7..b7bef13 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -1455,7 +1455,7 @@ CreateConflictDetectionSlot(void)
errmsg("creating replication conflict detection slot"));
ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false, RS_PERSISTENT, false,
- false, false);
+ false, false, false);
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index caa003c..6933e8d 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -653,92 +653,160 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool synced;
+ bool allow_overwrite;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ allow_overwrite = slot->data.allow_overwrite;
SpinLockRelease(&slot->mutex);
/* User-created slot with the same name exists, raise ERROR. */
if (!synced)
- ereport(ERROR,
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ /* Recreate the failover slot and synchronize it */
+ NameData plugin_name;
+ TransactionId xmin_horizon = InvalidTransactionId;
+
+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but try to recreate it because "
+ "flag allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotDrop(remote_slot->name,true);
+
+ /* Skip creating the local slot if remote_slot is invalidated already */
+ if (remote_slot->invalidated != RS_INVAL_NONE)
+ return false;
+
+ /*
+ * We create temporary slots instead of ephemeral slots here because
+ * we want the slots to survive after releasing them. This is done to
+ * avoid dropping and re-creating the slots in each synchronization
+ * cycle if the restart_lsn or catalog_xmin of the remote slot has not
+ * caught up.
+ */
+ ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
+ remote_slot->two_phase,
+ remote_slot->failover,
+ true,
+ true);
+
+ /* For shorter lines. */
+ slot = MyReplicationSlot;
+
+ /* Avoid expensive operations while holding a spinlock. */
+ namestrcpy(&plugin_name, remote_slot->plugin);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.database = remote_dbid;
+ slot->data.plugin = plugin_name;
+ SpinLockRelease(&slot->mutex);
+
+ reserve_wal_for_local_slot(remote_slot->restart_lsn);
+
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ xmin_horizon = GetOldestSafeDecodingTransactionId(true);
+ SpinLockAcquire(&slot->mutex);
+ slot->effective_catalog_xmin = xmin_horizon;
+ slot->data.catalog_xmin = xmin_horizon;
+ SpinLockRelease(&slot->mutex);
+ ReplicationSlotsComputeRequiredXmin(true);
+ LWLockRelease(ProcArrayLock);
+
+ update_and_persist_local_synced_slot(remote_slot, remote_dbid);
+
+ slot_updated = true;
+ }
+ else
+ ereport(ERROR,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("exiting from slot synchronization because same"
" name slot \"%s\" already exists on the standby",
remote_slot->name));
-
- /*
- * The slot has been synchronized before.
- *
- * It is important to acquire the slot here before checking
- * invalidation. If we don't acquire the slot first, there could be a
- * race condition that the local slot could be invalidated just after
- * checking the 'invalidated' flag here and we could end up
- * overwriting 'invalidated' flag to remote_slot's value. See
- * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
- * if the slot is not acquired by other processes.
- *
- * XXX: If it ever turns out that slot acquire/release is costly for
- * cases when none of the slot properties is changed then we can do a
- * pre-check to ensure that at least one of the slot properties is
- * changed before acquiring the slot.
- */
- ReplicationSlotAcquire(remote_slot->name, true, false);
-
- Assert(slot == MyReplicationSlot);
-
- /*
- * Copy the invalidation cause from remote only if local slot is not
- * invalidated locally, we don't want to overwrite existing one.
- */
- if (slot->data.invalidated == RS_INVAL_NONE &&
- remote_slot->invalidated != RS_INVAL_NONE)
+ }
+ else
{
- SpinLockAcquire(&slot->mutex);
- slot->data.invalidated = remote_slot->invalidated;
- SpinLockRelease(&slot->mutex);
+
+ /*
+ * The slot has been synchronized before.
+ *
+ * It is important to acquire the slot here before checking
+ * invalidation. If we don't acquire the slot first, there could be a
+ * race condition that the local slot could be invalidated just after
+ * checking the 'invalidated' flag here and we could end up
+ * overwriting 'invalidated' flag to remote_slot's value. See
+ * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
+ * if the slot is not acquired by other processes.
+ *
+ * XXX: If it ever turns out that slot acquire/release is costly for
+ * cases when none of the slot properties is changed then we can do a
+ * pre-check to ensure that at least one of the slot properties is
+ * changed before acquiring the slot.
+ */
+ ReplicationSlotAcquire(remote_slot->name, true, false);
+
+ Assert(slot == MyReplicationSlot);
- /* Make sure the invalidated state persists across server restart */
- ReplicationSlotMarkDirty();
- ReplicationSlotSave();
+ /*
+ * Copy the invalidation cause from remote only if local slot is not
+ * invalidated locally, we don't want to overwrite existing one.
+ */
+ if (slot->data.invalidated == RS_INVAL_NONE &&
+ remote_slot->invalidated != RS_INVAL_NONE)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = remote_slot->invalidated;
+ SpinLockRelease(&slot->mutex);
- slot_updated = true;
- }
+ /* Make sure the invalidated state persists across server restart */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
- /* Skip the sync of an invalidated slot */
- if (slot->data.invalidated != RS_INVAL_NONE)
- {
- ReplicationSlotRelease();
- return slot_updated;
- }
+ slot_updated = true;
+ }
- /* Slot not ready yet, let's attempt to make it sync-ready now. */
- if (slot->data.persistency == RS_TEMPORARY)
- {
- slot_updated = update_and_persist_local_synced_slot(remote_slot,
- remote_dbid);
- }
+ /* Skip the sync of an invalidated slot */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ {
+ ReplicationSlotRelease();
+ return slot_updated;
+ }
- /* Slot ready for sync, so sync it. */
- else
- {
- /*
- * Sanity check: As long as the invalidations are handled
- * appropriately as above, this should never happen.
- *
- * We don't need to check restart_lsn here. See the comments in
- * update_local_synced_slot() for details.
- */
- if (remote_slot->confirmed_lsn < slot->data.confirmed_flush)
- ereport(ERROR,
- errmsg_internal("cannot synchronize local slot \"%s\"",
- remote_slot->name),
- errdetail_internal("Local slot's start streaming location LSN(%X/%08X) is ahead of remote slot's LSN(%X/%08X).",
- LSN_FORMAT_ARGS(slot->data.confirmed_flush),
- LSN_FORMAT_ARGS(remote_slot->confirmed_lsn)));
+ /* Slot not ready yet, let's attempt to make it sync-ready now. */
+ if (slot->data.persistency == RS_TEMPORARY)
+ {
+ slot_updated = update_and_persist_local_synced_slot(remote_slot,
+ remote_dbid);
+ }
- slot_updated = update_local_synced_slot(remote_slot, remote_dbid,
- NULL, NULL);
- }
+ /* Slot ready for sync, so sync it. */
+ else
+ {
+ /*
+ * Sanity check: As long as the invalidations are handled
+ * appropriately as above, this should never happen.
+ *
+ * We don't need to check restart_lsn here. See the comments in
+ * update_local_synced_slot() for details.
+ */
+ if (remote_slot->confirmed_lsn < slot->data.confirmed_flush)
+ ereport(ERROR,
+ errmsg_internal("cannot synchronize local slot \"%s\"",
+ remote_slot->name),
+ errdetail_internal("Local slot's start streaming location LSN(%X/%08X) is ahead of remote slot's LSN(%X/%08X).",
+ LSN_FORMAT_ARGS(slot->data.confirmed_flush),
+ LSN_FORMAT_ARGS(remote_slot->confirmed_lsn)));
+
+ slot_updated = update_local_synced_slot(remote_slot, remote_dbid,
+ NULL, NULL);
+ }
+ }
}
/* Otherwise create the slot first. */
else
@@ -760,6 +828,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
+ true,
true);
/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb9..fcba4d4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -351,7 +351,7 @@ IsSlotForConflictCheck(const char *name)
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool allow_overwrite, bool synced)
{
ReplicationSlot *slot = NULL;
int i;
@@ -445,6 +445,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.allow_overwrite = allow_overwrite;
slot->data.synced = synced;
/* and then data only present in shared memory */
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 69f4c61..38f7460 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -41,7 +41,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, false);
if (immediately_reserve)
{
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, bool allow_overwrite,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, allow_overwrite, false);
/*
* Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ bool allow_overwrite = PG_GETARG_BOOL(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
temporary,
two_phase,
failover,
+ allow_overwrite,
InvalidXLogRecPtr,
true);
@@ -726,6 +728,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ false,
src_restart_lsn,
false);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e3dce9d..a2db1e3 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1201,7 +1201,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, false);
if (reserve_wal)
{
@@ -1232,7 +1232,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, false);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 118d6da..ee55572 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11498,10 +11498,10 @@
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool bool',
+ proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index e8fc342..c2e2871 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -141,6 +141,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /*
+ * Allow Postgres to drop logical replication slot on standby server to ensure
+ * creation of new failover slot when remote sync_replication_slots is set to true.
+ */
+ bool allow_overwrite;
} ReplicationSlotPersistentData;
/*
@@ -301,7 +307,7 @@ extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover,
+ bool two_phase, bool failover, bool allow_overwrite,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
--
2.47.3
On Fri, Sep 26, 2025 at 9:52 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Hi Shveta,
Here is the version 1 of the patch with corrections
Thanks. We can bump the version to v2 now — that is, each time we post
a new version, we can increment the version number.
The patch fails to apply to the latest HEAD. Also it had some
'trailing whitespace' issues. Can you please rebase the patch and post
it again?
5)
IMO, LOG is sufficient here, as the action aligns with the
user-provided 'allow_overwrite' setting.
Point 5
Where is it in the code?
This one:
+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but try to recreate it because "
+ "flag allow_overwrite is set to true",
+ remote_slot->name));
thanks
Shveta
Hi,
Here is the generated v2 of the patch.
Thanks
Fabrice
Attachment: v2-0001-Add-allow_overwrite-option-to-logical-replication.patch (application/x-patch)
From 2ce1d7fa50df7ed6f7370b5e725a15ad11ec3d08 Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice636861@gmail.com>
Date: Mon, 29 Sep 2025 13:10:38 +0200
Subject: [PATCH v2] Add allow_overwrite option to logical replication slot
creation
This patch adds a new parameter allow_overwrite to
pg_create_logical_replication_slot. When true, an existing
failover slot on standby will be dropped and replaced
automatically.
Signed-off-by: Fabrice Chapuis <fabrice636861@gmail.com>
---
src/backend/catalog/system_functions.sql | 1 +
src/backend/replication/logical/launcher.c | 2 +-
src/backend/replication/logical/slotsync.c | 203 ++++++++++++++-------
src/backend/replication/slot.c | 3 +-
src/backend/replication/slotfuncs.c | 9 +-
src/backend/replication/walsender.c | 4 +-
src/include/catalog/pg_proc.dat | 8 +-
src/include/replication/slot.h | 8 +-
8 files changed, 159 insertions(+), 79 deletions(-)
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 2d946d6..75ea84f 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN allow_overwrite boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 218cefe..a26af52 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -1519,7 +1519,7 @@ CreateConflictDetectionSlot(void)
errmsg("creating replication conflict detection slot"));
ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false, RS_PERSISTENT, false,
- false, false);
+ false, false, false);
init_conflict_slot_xmin();
}
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 8c061d5..dbf9380 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -653,91 +653,159 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool synced;
+ bool allow_overwrite;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ allow_overwrite = slot->data.allow_overwrite;
SpinLockRelease(&slot->mutex);
/* User-created slot with the same name exists, raise ERROR. */
if (!synced)
- ereport(ERROR,
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ /* Recreate the failover slot and synchronize it */
+ NameData plugin_name;
+ TransactionId xmin_horizon = InvalidTransactionId;
+
+ ereport(LOG,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but try to recreate it because "
+ "flag allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotDrop(remote_slot->name,true);
+
+ /* Skip creating the local slot if remote_slot is invalidated already */
+ if (remote_slot->invalidated != RS_INVAL_NONE)
+ return false;
+
+ /*
+ * We create temporary slots instead of ephemeral slots here because
+ * we want the slots to survive after releasing them. This is done to
+ * avoid dropping and re-creating the slots in each synchronization
+ * cycle if the restart_lsn or catalog_xmin of the remote slot has not
+ * caught up.
+ */
+ ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
+ remote_slot->two_phase,
+ remote_slot->failover,
+ true,
+ true);
+
+ /* For shorter lines. */
+ slot = MyReplicationSlot;
+
+ /* Avoid expensive operations while holding a spinlock. */
+ namestrcpy(&plugin_name, remote_slot->plugin);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.database = remote_dbid;
+ slot->data.plugin = plugin_name;
+ SpinLockRelease(&slot->mutex);
+
+ reserve_wal_for_local_slot(remote_slot->restart_lsn);
+
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ xmin_horizon = GetOldestSafeDecodingTransactionId(true);
+ SpinLockAcquire(&slot->mutex);
+ slot->effective_catalog_xmin = xmin_horizon;
+ slot->data.catalog_xmin = xmin_horizon;
+ SpinLockRelease(&slot->mutex);
+ ReplicationSlotsComputeRequiredXmin(true);
+ LWLockRelease(ProcArrayLock);
+
+ update_and_persist_local_synced_slot(remote_slot, remote_dbid);
+
+ slot_updated = true;
+ }
+ else
+ ereport(ERROR,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("exiting from slot synchronization because same"
" name slot \"%s\" already exists on the standby",
remote_slot->name));
-
- /*
- * The slot has been synchronized before.
- *
- * It is important to acquire the slot here before checking
- * invalidation. If we don't acquire the slot first, there could be a
- * race condition that the local slot could be invalidated just after
- * checking the 'invalidated' flag here and we could end up
- * overwriting 'invalidated' flag to remote_slot's value. See
- * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
- * if the slot is not acquired by other processes.
- *
- * XXX: If it ever turns out that slot acquire/release is costly for
- * cases when none of the slot properties is changed then we can do a
- * pre-check to ensure that at least one of the slot properties is
- * changed before acquiring the slot.
- */
- ReplicationSlotAcquire(remote_slot->name, true, false);
-
- Assert(slot == MyReplicationSlot);
-
- /*
- * Copy the invalidation cause from remote only if local slot is not
- * invalidated locally, we don't want to overwrite existing one.
- */
- if (slot->data.invalidated == RS_INVAL_NONE &&
- remote_slot->invalidated != RS_INVAL_NONE)
+ }
+ else
{
- SpinLockAcquire(&slot->mutex);
- slot->data.invalidated = remote_slot->invalidated;
- SpinLockRelease(&slot->mutex);
- /* Make sure the invalidated state persists across server restart */
- ReplicationSlotMarkDirty();
- ReplicationSlotSave();
+ /*
+ * The slot has been synchronized before.
+ *
+ * It is important to acquire the slot here before checking
+ * invalidation. If we don't acquire the slot first, there could be a
+ * race condition that the local slot could be invalidated just after
+ * checking the 'invalidated' flag here and we could end up
+ * overwriting 'invalidated' flag to remote_slot's value. See
+ * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
+ * if the slot is not acquired by other processes.
+ *
+ * XXX: If it ever turns out that slot acquire/release is costly for
+ * cases when none of the slot properties is changed then we can do a
+ * pre-check to ensure that at least one of the slot properties is
+ * changed before acquiring the slot.
+ */
+ ReplicationSlotAcquire(remote_slot->name, true, false);
+
+ Assert(slot == MyReplicationSlot);
- slot_updated = true;
- }
+ /*
+ * Copy the invalidation cause from remote only if local slot is not
+ * invalidated locally, we don't want to overwrite existing one.
+ */
+ if (slot->data.invalidated == RS_INVAL_NONE &&
+ remote_slot->invalidated != RS_INVAL_NONE)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = remote_slot->invalidated;
+ SpinLockRelease(&slot->mutex);
- /* Skip the sync of an invalidated slot */
- if (slot->data.invalidated != RS_INVAL_NONE)
- {
- ReplicationSlotRelease();
- return slot_updated;
- }
+ /* Make sure the invalidated state persists across server restart */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
- /* Slot not ready yet, let's attempt to make it sync-ready now. */
- if (slot->data.persistency == RS_TEMPORARY)
- {
- slot_updated = update_and_persist_local_synced_slot(remote_slot,
- remote_dbid);
- }
+ slot_updated = true;
+ }
- /* Slot ready for sync, so sync it. */
- else
- {
- /*
- * Sanity check: As long as the invalidations are handled
- * appropriately as above, this should never happen.
- *
- * We don't need to check restart_lsn here. See the comments in
- * update_local_synced_slot() for details.
- */
- if (remote_slot->confirmed_lsn < slot->data.confirmed_flush)
- ereport(ERROR,
- errmsg_internal("cannot synchronize local slot \"%s\"",
- remote_slot->name),
- errdetail_internal("Local slot's start streaming location LSN(%X/%08X) is ahead of remote slot's LSN(%X/%08X).",
- LSN_FORMAT_ARGS(slot->data.confirmed_flush),
- LSN_FORMAT_ARGS(remote_slot->confirmed_lsn)));
+ /* Skip the sync of an invalidated slot */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ {
+ ReplicationSlotRelease();
+ return slot_updated;
+ }
- slot_updated = update_local_synced_slot(remote_slot, remote_dbid,
- NULL, NULL);
+ /* Slot not ready yet, let's attempt to make it sync-ready now. */
+ if (slot->data.persistency == RS_TEMPORARY)
+ {
+ slot_updated = update_and_persist_local_synced_slot(remote_slot,
+ remote_dbid);
+ }
+
+ /* Slot ready for sync, so sync it. */
+ else
+ {
+ /*
+ * Sanity check: As long as the invalidations are handled
+ * appropriately as above, this should never happen.
+ *
+ * We don't need to check restart_lsn here. See the comments in
+ * update_local_synced_slot() for details.
+ */
+ if (remote_slot->confirmed_lsn < slot->data.confirmed_flush)
+ ereport(ERROR,
+ errmsg_internal("cannot synchronize local slot \"%s\"",
+ remote_slot->name),
+ errdetail_internal("Local slot's start streaming location LSN(%X/%08X) is ahead of remote slot's LSN(%X/%08X).",
+ LSN_FORMAT_ARGS(slot->data.confirmed_flush),
+ LSN_FORMAT_ARGS(remote_slot->confirmed_lsn)));
+
+ slot_updated = update_local_synced_slot(remote_slot, remote_dbid,
+ NULL, NULL);
+ }
}
}
/* Otherwise create the slot first. */
@@ -760,6 +828,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
+ true,
true);
/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb9..fcba4d4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -351,7 +351,7 @@ IsSlotForConflictCheck(const char *name)
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool allow_overwrite, bool synced)
{
ReplicationSlot *slot = NULL;
int i;
@@ -445,6 +445,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.allow_overwrite = allow_overwrite;
slot->data.synced = synced;
/* and then data only present in shared memory */
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b8f2115..1a50244 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -41,7 +41,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, false);
if (immediately_reserve)
{
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, bool allow_overwrite,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, allow_overwrite, false);
/*
* Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ bool allow_overwrite = PG_GETARG_BOOL(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
temporary,
two_phase,
failover,
+ allow_overwrite,
InvalidXLogRecPtr,
true);
@@ -726,6 +728,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ false,
src_restart_lsn,
false);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 59822f2..40586b2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1202,7 +1202,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, false);
if (reserve_wal)
{
@@ -1233,7 +1233,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, false);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 01eba3b..fbaacfb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11510,10 +11510,10 @@
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool bool',
+ proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162..ca50b79 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -141,6 +141,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /*
+ * Allow Postgres to drop logical replication slot on standby server to ensure
+ * creation of new failover slot when remote sync_replication_slots is set to true.
+ */
+ bool allow_overwrite;
} ReplicationSlotPersistentData;
/*
@@ -301,7 +307,7 @@ extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover,
+ bool two_phase, bool failover, bool allow_overwrite,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
--
2.47.3
On Tue, Sep 30, 2025 at 12:17 PM Fabrice Chapuis
<fabrice636861@gmail.com> wrote:
Hi,
Here the generated v2 of the Patch.
Thanks. I have refactored the code for synchronize_one_slot() as there
was some code-repetition. Please take it if you find it okay.
Also I felt that when we create a slot through slot-synchronization,
we should create it with allow_overwrite as false. And thus in the
attached patch, I have changed that part as well.
It is a top up patch. Attached it as txt, please rename before
applying atop your changes. Attached the steps for your reference.
thanks
Shveta
Attachments:
0001-refactored-synchronize_one_slot.patch.txt
From e99ef98faad4a2ff32daa4db36daceddf8225cf7 Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Tue, 30 Sep 2025 14:54:41 +0530
Subject: [PATCH] refactored synchronize_one_slot
---
src/backend/replication/logical/slotsync.c | 209 ++++++++-------------
1 file changed, 82 insertions(+), 127 deletions(-)
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index dbf9380cb39..a8d44fe6b45 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -649,6 +649,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
return false;
}
+retry:
/* Search for the named slot */
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
@@ -666,146 +667,100 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
/* Check if we need to overwrite an existing logical slot */
if (allow_overwrite)
{
- /* Recreate the failover slot and synchronize it */
- NameData plugin_name;
- TransactionId xmin_horizon = InvalidTransactionId;
-
ereport(LOG,
- errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("slot \"%s\" already exists"
- " on the standby but try to recreate it because "
- "flag allow_overwrite is set to true",
- remote_slot->name));
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but will be overwritten as"
+ " allow_overwrite is set to true",
+ remote_slot->name));
/* Get rid of a replication slot that is no longer wanted */
- ReplicationSlotDrop(remote_slot->name,true);
-
- /* Skip creating the local slot if remote_slot is invalidated already */
- if (remote_slot->invalidated != RS_INVAL_NONE)
- return false;
-
- /*
- * We create temporary slots instead of ephemeral slots here because
- * we want the slots to survive after releasing them. This is done to
- * avoid dropping and re-creating the slots in each synchronization
- * cycle if the restart_lsn or catalog_xmin of the remote slot has not
- * caught up.
- */
- ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
- remote_slot->two_phase,
- remote_slot->failover,
- true,
- true);
-
- /* For shorter lines. */
- slot = MyReplicationSlot;
-
- /* Avoid expensive operations while holding a spinlock. */
- namestrcpy(&plugin_name, remote_slot->plugin);
-
- SpinLockAcquire(&slot->mutex);
- slot->data.database = remote_dbid;
- slot->data.plugin = plugin_name;
- SpinLockRelease(&slot->mutex);
-
- reserve_wal_for_local_slot(remote_slot->restart_lsn);
-
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- xmin_horizon = GetOldestSafeDecodingTransactionId(true);
- SpinLockAcquire(&slot->mutex);
- slot->effective_catalog_xmin = xmin_horizon;
- slot->data.catalog_xmin = xmin_horizon;
- SpinLockRelease(&slot->mutex);
- ReplicationSlotsComputeRequiredXmin(true);
- LWLockRelease(ProcArrayLock);
-
- update_and_persist_local_synced_slot(remote_slot, remote_dbid);
-
- slot_updated = true;
+ ReplicationSlotAcquire(remote_slot->name, true, false);
+ ReplicationSlotDropAcquired();
+ goto retry;
}
else
+ {
ereport(ERROR,
- errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("exiting from slot synchronization because same"
- " name slot \"%s\" already exists on the standby",
- remote_slot->name));
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("exiting from slot synchronization because same"
+ " name slot \"%s\" already exists on the standby",
+ remote_slot->name));
+ }
}
- else
- {
- /*
- * The slot has been synchronized before.
- *
- * It is important to acquire the slot here before checking
- * invalidation. If we don't acquire the slot first, there could be a
- * race condition that the local slot could be invalidated just after
- * checking the 'invalidated' flag here and we could end up
- * overwriting 'invalidated' flag to remote_slot's value. See
- * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
- * if the slot is not acquired by other processes.
- *
- * XXX: If it ever turns out that slot acquire/release is costly for
- * cases when none of the slot properties is changed then we can do a
- * pre-check to ensure that at least one of the slot properties is
- * changed before acquiring the slot.
- */
- ReplicationSlotAcquire(remote_slot->name, true, false);
-
- Assert(slot == MyReplicationSlot);
+ /*
+ * The slot has been synchronized before.
+ *
+ * It is important to acquire the slot here before checking
+ * invalidation. If we don't acquire the slot first, there could be a
+ * race condition that the local slot could be invalidated just after
+ * checking the 'invalidated' flag here and we could end up
+ * overwriting 'invalidated' flag to remote_slot's value. See
+ * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
+ * if the slot is not acquired by other processes.
+ *
+ * XXX: If it ever turns out that slot acquire/release is costly for
+ * cases when none of the slot properties is changed then we can do a
+ * pre-check to ensure that at least one of the slot properties is
+ * changed before acquiring the slot.
+ */
+ ReplicationSlotAcquire(remote_slot->name, true, false);
- /*
- * Copy the invalidation cause from remote only if local slot is not
- * invalidated locally, we don't want to overwrite existing one.
- */
- if (slot->data.invalidated == RS_INVAL_NONE &&
- remote_slot->invalidated != RS_INVAL_NONE)
- {
- SpinLockAcquire(&slot->mutex);
- slot->data.invalidated = remote_slot->invalidated;
- SpinLockRelease(&slot->mutex);
+ Assert(slot == MyReplicationSlot);
- /* Make sure the invalidated state persists across server restart */
- ReplicationSlotMarkDirty();
- ReplicationSlotSave();
+ /*
+ * Copy the invalidation cause from remote only if local slot is not
+ * invalidated locally, we don't want to overwrite existing one.
+ */
+ if (slot->data.invalidated == RS_INVAL_NONE &&
+ remote_slot->invalidated != RS_INVAL_NONE)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = remote_slot->invalidated;
+ SpinLockRelease(&slot->mutex);
- slot_updated = true;
- }
+ /* Make sure the invalidated state persists across server restart */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
- /* Skip the sync of an invalidated slot */
- if (slot->data.invalidated != RS_INVAL_NONE)
- {
- ReplicationSlotRelease();
- return slot_updated;
- }
+ slot_updated = true;
+ }
- /* Slot not ready yet, let's attempt to make it sync-ready now. */
- if (slot->data.persistency == RS_TEMPORARY)
- {
- slot_updated = update_and_persist_local_synced_slot(remote_slot,
- remote_dbid);
- }
+ /* Skip the sync of an invalidated slot */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ {
+ ReplicationSlotRelease();
+ return slot_updated;
+ }
- /* Slot ready for sync, so sync it. */
- else
- {
- /*
- * Sanity check: As long as the invalidations are handled
- * appropriately as above, this should never happen.
- *
- * We don't need to check restart_lsn here. See the comments in
- * update_local_synced_slot() for details.
- */
- if (remote_slot->confirmed_lsn < slot->data.confirmed_flush)
- ereport(ERROR,
- errmsg_internal("cannot synchronize local slot \"%s\"",
- remote_slot->name),
- errdetail_internal("Local slot's start streaming location LSN(%X/%08X) is ahead of remote slot's LSN(%X/%08X).",
- LSN_FORMAT_ARGS(slot->data.confirmed_flush),
- LSN_FORMAT_ARGS(remote_slot->confirmed_lsn)));
-
- slot_updated = update_local_synced_slot(remote_slot, remote_dbid,
- NULL, NULL);
- }
+ /* Slot not ready yet, let's attempt to make it sync-ready now. */
+ if (slot->data.persistency == RS_TEMPORARY)
+ {
+ slot_updated = update_and_persist_local_synced_slot(remote_slot,
+ remote_dbid);
+ }
+
+ /* Slot ready for sync, so sync it. */
+ else
+ {
+ /*
+ * Sanity check: As long as the invalidations are handled
+ * appropriately as above, this should never happen.
+ *
+ * We don't need to check restart_lsn here. See the comments in
+ * update_local_synced_slot() for details.
+ */
+ if (remote_slot->confirmed_lsn < slot->data.confirmed_flush)
+ ereport(ERROR,
+ errmsg_internal("cannot synchronize local slot \"%s\"",
+ remote_slot->name),
+ errdetail_internal("Local slot's start streaming location LSN(%X/%08X) is ahead of remote slot's LSN(%X/%08X).",
+ LSN_FORMAT_ARGS(slot->data.confirmed_flush),
+ LSN_FORMAT_ARGS(remote_slot->confirmed_lsn)));
+
+ slot_updated = update_local_synced_slot(remote_slot, remote_dbid,
+ NULL, NULL);
}
}
/* Otherwise create the slot first. */
@@ -828,7 +783,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true,
+ false,
true);
/* For shorter lines. */
--
2.34.1
On Tue, Sep 30, 2025 at 3:11 PM shveta malik <shveta.malik@gmail.com> wrote:
Thanks. I have refactored the code for synchronize_one_slot() as there
was some code-repetition. Please take it if you find it okay.
Ok for the goto retry.
Also I felt that when we create a slot through slot-synchronization,
we should create it with allow_overwrite as false. And thus in the
attached patch, I have changed that part as well.
The problem is that if allow_overwrite is set to false on the new standby,
it does not reflect the "allow_overwrite" parameter value set on the
primary instance.
For the moment we can leave slot creation with allow_overwrite set to
false.
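Just to illustrate what I mean (a rough sketch, assuming the patched
pg_create_logical_replication_slot with the new allow_overwrite argument;
the slot name is only an example):

-- on the primary, with the patch applied
SELECT pg_create_logical_replication_slot('logical_slot', 'pgoutput',
                                          failover => true,
                                          allow_overwrite => true);

After a switchover, the slot recreated by the sync worker on the new
standby starts again with allow_overwrite = false, so the value chosen
on the primary is not carried over.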
It is a top up patch. Attached it as txt, please rename before
applying atop your changes. Attached the steps for your reference.
The patch works fine. I generated v3.
Regards
Fabrice
Attachments:
v3-0001-Add-allow_overwrite-option-to-logical-replication.patch
From 3723bae39873517e185577acf0c3ca82c08edd3c Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice636861@gmail.com>
Date: Mon, 29 Sep 2025 13:10:38 +0200
Subject: [PATCH v3] Add allow_overwrite option to logical replication slot
creation
This patch adds a new parameter allow_overwrite to
pg_create_logical_replication_slot. When true, an existing
failover slot on standby will be dropped and replaced
automatically.
Signed-off-by: Fabrice Chapuis <fabrice636861@gmail.com>
---
src/backend/catalog/system_functions.sql | 1 +
src/backend/replication/logical/launcher.c | 2 +-
src/backend/replication/logical/slotsync.c | 34 ++++++++++++++++++----
src/backend/replication/slot.c | 3 +-
src/backend/replication/slotfuncs.c | 9 ++++--
src/backend/replication/walsender.c | 4 +--
src/include/catalog/pg_proc.dat | 8 ++---
src/include/replication/slot.h | 8 ++++-
8 files changed, 52 insertions(+), 17 deletions(-)
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 2d946d6..75ea84f 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN allow_overwrite boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 218cefe..a26af52 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -1519,7 +1519,7 @@ CreateConflictDetectionSlot(void)
errmsg("creating replication conflict detection slot"));
ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false, RS_PERSISTENT, false,
- false, false);
+ false, false, false);
init_conflict_slot_xmin();
}
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 8c061d5..a8d44fe 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -649,22 +649,45 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
return false;
}
+retry:
/* Search for the named slot */
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool synced;
+ bool allow_overwrite;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ allow_overwrite = slot->data.allow_overwrite;
SpinLockRelease(&slot->mutex);
/* User-created slot with the same name exists, raise ERROR. */
if (!synced)
- ereport(ERROR,
- errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("exiting from slot synchronization because same"
- " name slot \"%s\" already exists on the standby",
- remote_slot->name));
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ ereport(LOG,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but will be overwritten as"
+ " allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotAcquire(remote_slot->name, true, false);
+ ReplicationSlotDropAcquired();
+ goto retry;
+ }
+ else
+ {
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("exiting from slot synchronization because same"
+ " name slot \"%s\" already exists on the standby",
+ remote_slot->name));
+ }
+ }
/*
* The slot has been synchronized before.
@@ -760,6 +783,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
+ false,
true);
/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb9..fcba4d4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -351,7 +351,7 @@ IsSlotForConflictCheck(const char *name)
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool allow_overwrite, bool synced)
{
ReplicationSlot *slot = NULL;
int i;
@@ -445,6 +445,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.allow_overwrite = allow_overwrite;
slot->data.synced = synced;
/* and then data only present in shared memory */
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b8f2115..1a50244 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -41,7 +41,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, false);
if (immediately_reserve)
{
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, bool allow_overwrite,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, allow_overwrite, false);
/*
* Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ bool allow_overwrite = PG_GETARG_BOOL(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
temporary,
two_phase,
failover,
+ allow_overwrite,
InvalidXLogRecPtr,
true);
@@ -726,6 +728,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ false,
src_restart_lsn,
false);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 59822f2..40586b2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1202,7 +1202,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, false);
if (reserve_wal)
{
@@ -1233,7 +1233,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, false);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 01eba3b..fbaacfb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11510,10 +11510,10 @@
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool bool',
+ proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162..ca50b79 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -141,6 +141,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /*
+ * Allow Postgres to drop logical replication slot on standby server to ensure
+ * creation of new failover slot when remote sync_replication_slots is set to true.
+ */
+ bool allow_overwrite;
} ReplicationSlotPersistentData;
/*
@@ -301,7 +307,7 @@ extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover,
+ bool two_phase, bool failover, bool allow_overwrite,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
--
2.47.3
Next, pg_replication_slots (pg_get_replication_slots) can be modified
to display the new property. And the corresponding doc for
pg_replication_slots (doc/src/sgml/system-views.sgml) can be updated.
Yes, next I'll work on this part.
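Roughly something like this once the column is exposed (just a sketch,
with the column name taken from the patch):

SELECT slot_name, failover, synced, allow_overwrite
  FROM pg_replication_slots
 WHERE slot_type = 'logical';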
Regards,
Fabrice
Hi Shveta,
Here is v4 of the patch, with the pg_replication_slots view modified to
display the allow_overwrite field. The doc was also updated.
Regards,
Fabrice
Attachments:
v4-0001-Add-allow_overwrite-option-to-logical-replication.patch
From 25e513bf49f3afe4b6ed7333ae46705c357eeb47 Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice636861@gmail.com>
Date: Mon, 29 Sep 2025 13:10:38 +0200
Subject: [PATCH v4] Add allow_overwrite option to logical replication slot
creation
This patch adds a new parameter allow_overwrite to
pg_create_logical_replication_slot. When true, an existing
failover slot on standby will be dropped and replaced
automatically.
Signed-off-by: Fabrice Chapuis <fabrice636861@gmail.com>
---
doc/src/sgml/system-views.sgml | 12 ++++++++
src/backend/catalog/system_functions.sql | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/logical/launcher.c | 2 +-
src/backend/replication/logical/slotsync.c | 34 ++++++++++++++++++----
src/backend/replication/slot.c | 3 +-
src/backend/replication/slotfuncs.c | 13 ++++++---
src/backend/replication/walsender.c | 4 +--
src/include/catalog/pg_proc.dat | 14 ++++-----
src/include/replication/slot.h | 8 ++++-
10 files changed, 72 insertions(+), 22 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 4187191ea74..0d94622cd0c 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3036,6 +3036,18 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>allow_overwrite</structfield> <type>bool</type>
+ </para>
+ <para>
+ The parameter controls whether existing logical replication slot metadata may
+ be overwritten on the standby instance. The default is <literal>false</literal>.
+ When true, an existing failover slot on standby instance will be dropped and replaced
+ automatically.
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 2d946d6d9e9..75ea84fd837 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN allow_overwrite boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c77fa0234bb..7fc42ddbe66 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1046,7 +1046,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.invalidation_reason,
L.failover,
- L.synced
+ L.synced,
+ L.allow_overwrite
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 218cefe86e2..a26af527775 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -1519,7 +1519,7 @@ CreateConflictDetectionSlot(void)
errmsg("creating replication conflict detection slot"));
ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false, RS_PERSISTENT, false,
- false, false);
+ false, false, false);
init_conflict_slot_xmin();
}
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 8c061d55bdb..a8d44fe6b45 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -649,22 +649,45 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
return false;
}
+retry:
/* Search for the named slot */
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool synced;
+ bool allow_overwrite;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ allow_overwrite = slot->data.allow_overwrite;
SpinLockRelease(&slot->mutex);
/* User-created slot with the same name exists, raise ERROR. */
if (!synced)
- ereport(ERROR,
- errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("exiting from slot synchronization because same"
- " name slot \"%s\" already exists on the standby",
- remote_slot->name));
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ ereport(LOG,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but will be overwritten as"
+ " allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotAcquire(remote_slot->name, true, false);
+ ReplicationSlotDropAcquired();
+ goto retry;
+ }
+ else
+ {
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("exiting from slot synchronization because same"
+ " name slot \"%s\" already exists on the standby",
+ remote_slot->name));
+ }
+ }
/*
* The slot has been synchronized before.
@@ -760,6 +783,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
+ false,
true);
/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb96d42..fcba4d4cff9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -351,7 +351,7 @@ IsSlotForConflictCheck(const char *name)
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool allow_overwrite, bool synced)
{
ReplicationSlot *slot = NULL;
int i;
@@ -445,6 +445,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.allow_overwrite = allow_overwrite;
slot->data.synced = synced;
/* and then data only present in shared memory */
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b8f21153e7b..6a6a03e9a4d 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -41,7 +41,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, false);
if (immediately_reserve)
{
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, bool allow_overwrite,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, allow_overwrite, false);
/*
* Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ bool allow_overwrite = PG_GETARG_BOOL(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
temporary,
two_phase,
failover,
+ allow_overwrite,
InvalidXLogRecPtr,
true);
@@ -235,7 +237,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 20
+#define PG_GET_REPLICATION_SLOTS_COLS 21
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -443,6 +445,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ values[i++] = BoolGetDatum(slot_contents.data.allow_overwrite);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
@@ -726,6 +730,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ false,
src_restart_lsn,
false);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 59822f22b8d..40586b273ba 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1202,7 +1202,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, false);
if (reserve_wal)
{
@@ -1233,7 +1233,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, false);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 01eba3b5a19..2f980c0dd4f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11503,17 +11503,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,pg_lsn,timestamptz,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,two_phase_at,inactive_since,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,pg_lsn,timestamptz,bool,text,bool,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,two_phase_at,inactive_since,conflicting,invalidation_reason,failover,synced,allow_overwrite}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool bool',
+ proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162cde3..ca50b79daaa 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -141,6 +141,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /*
+ * Allow Postgres to drop logical replication slot on standby server to ensure
+ * creation of new failover slot when remote sync_replication_slots is set to true.
+ */
+ bool allow_overwrite;
} ReplicationSlotPersistentData;
/*
@@ -301,7 +307,7 @@ extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover,
+ bool two_phase, bool failover, bool allow_overwrite,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
--
2.39.5
On Fri, Oct 3, 2025 at 1:28 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Hi Shveta,
Here is the v4 of the patch with pg_replication_slots view modified to display the field allow_overwrite. Doc was also updated.
The patch looks okay. The parameter name is still open for discussion,
and the comments could be improved. But we can focus on these finer
details once more reviewers start reviewing and there’s general
agreement on the concept.
One trivial comment: we can slightly modify the doc to have something like this:
This parameter controls whether an existing logical replication slot
on the standby (with synced=false) can be overwritten during logical
replication slot synchronization (see Section 47.2.3). The default is
false. When true, an existing user slot with the same name on the
standby will be synchronized using the primary’s failover slot.
<please see high-availability.sgml to find how 'Section 47.2.3' can be
referenced in the doc>
~~
The next step will be to provide a way to modify this parameter via an
alter API, say pg_alter_logical_replication_slot(). This API can later
be extended to handle other parameters. This API can be implemented in
patch002 for easier review.
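Something along these lines, only as a sketch since the exact signature is
still open (the function does not exist yet):

-- hypothetical API, not part of any patch so far
SELECT pg_alter_logical_replication_slot('logical_slot',
                                         allow_overwrite => true);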
thanks
Shveta
Hi,
Here is patch v5; I changed the doc text per your proposition and added
the link.
At this stage, can the patch be submitted to the current commitfest for
review?
With Regards,
Fabrice
Attachments:
v5-0001-Add-allow_overwrite-option-to-logical-replication.patch
From 68bda4e973c95ff9d24a6fc866187903d04d3d21 Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice636861@gmail.com>
Date: Mon, 29 Sep 2025 13:10:38 +0200
Subject: [PATCH v5] Add allow_overwrite option to logical replication slot
creation
This patch adds a new parameter allow_overwrite to
pg_create_logical_replication_slot. When true, an existing
failover slot on standby will be dropped and replaced
automatically.
Signed-off-by: Fabrice Chapuis <fabrice636861@gmail.com>
---
doc/src/sgml/system-views.sgml | 14 +++++++++
src/backend/catalog/system_functions.sql | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/logical/launcher.c | 2 +-
src/backend/replication/logical/slotsync.c | 34 ++++++++++++++++++----
src/backend/replication/slot.c | 3 +-
src/backend/replication/slotfuncs.c | 13 ++++++---
src/backend/replication/walsender.c | 4 +--
src/include/catalog/pg_proc.dat | 14 ++++-----
src/include/replication/slot.h | 8 ++++-
10 files changed, 74 insertions(+), 22 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 4187191ea74..1a7bb7d2a4b 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3036,6 +3036,20 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>allow_overwrite</structfield> <type>bool</type>
+ </para>
+ <para>
+ This parameter controls whether an existing logical replication slot
+ on the standby (with synced=false) can be overwritten during logical
+ replication slot synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>). The default is
+ <literal>false</literal>. When <literal>true</literal>, an existing user slot with the same name on the
+ standby will be synchronized using the primary’s failover slot.
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 2d946d6d9e9..75ea84fd837 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN allow_overwrite boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c77fa0234bb..7fc42ddbe66 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1046,7 +1046,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.invalidation_reason,
L.failover,
- L.synced
+ L.synced,
+ L.allow_overwrite
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 218cefe86e2..a26af527775 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -1519,7 +1519,7 @@ CreateConflictDetectionSlot(void)
errmsg("creating replication conflict detection slot"));
ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false, RS_PERSISTENT, false,
- false, false);
+ false, false, false);
init_conflict_slot_xmin();
}
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 8c061d55bdb..a8d44fe6b45 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -649,22 +649,45 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
return false;
}
+retry:
/* Search for the named slot */
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool synced;
+ bool allow_overwrite;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ allow_overwrite = slot->data.allow_overwrite;
SpinLockRelease(&slot->mutex);
/* User-created slot with the same name exists, raise ERROR. */
if (!synced)
- ereport(ERROR,
- errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("exiting from slot synchronization because same"
- " name slot \"%s\" already exists on the standby",
- remote_slot->name));
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ ereport(LOG,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but will be overwritten as"
+ " allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotAcquire(remote_slot->name, true, false);
+ ReplicationSlotDropAcquired();
+ goto retry;
+ }
+ else
+ {
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("exiting from slot synchronization because same"
+ " name slot \"%s\" already exists on the standby",
+ remote_slot->name));
+ }
+ }
/*
* The slot has been synchronized before.
@@ -760,6 +783,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
+ false,
true);
/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb96d42..fcba4d4cff9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -351,7 +351,7 @@ IsSlotForConflictCheck(const char *name)
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool allow_overwrite, bool synced)
{
ReplicationSlot *slot = NULL;
int i;
@@ -445,6 +445,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.allow_overwrite = allow_overwrite;
slot->data.synced = synced;
/* and then data only present in shared memory */
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b8f21153e7b..6a6a03e9a4d 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -41,7 +41,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, false);
if (immediately_reserve)
{
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, bool allow_overwrite,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, allow_overwrite, false);
/*
* Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ bool allow_overwrite = PG_GETARG_BOOL(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
temporary,
two_phase,
failover,
+ allow_overwrite,
InvalidXLogRecPtr,
true);
@@ -235,7 +237,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 20
+#define PG_GET_REPLICATION_SLOTS_COLS 21
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -443,6 +445,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ values[i++] = BoolGetDatum(slot_contents.data.allow_overwrite);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
@@ -726,6 +730,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ false,
src_restart_lsn,
false);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 59822f22b8d..40586b273ba 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1202,7 +1202,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, false);
if (reserve_wal)
{
@@ -1233,7 +1233,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, false);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 01eba3b5a19..2f980c0dd4f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11503,17 +11503,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,pg_lsn,timestamptz,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,two_phase_at,inactive_since,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,pg_lsn,timestamptz,bool,text,bool,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,two_phase_at,inactive_since,conflicting,invalidation_reason,failover,synced,allow_overwrite}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool bool',
+ proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162cde3..ca50b79daaa 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -141,6 +141,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /*
+ * Allow Postgres to drop logical replication slot on standby server to ensure
+ * creation of new failover slot when remote sync_replication_slots is set to true.
+ */
+ bool allow_overwrite;
} ReplicationSlotPersistentData;
/*
@@ -301,7 +307,7 @@ extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover,
+ bool two_phase, bool failover, bool allow_overwrite,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
--
2.39.5
On Thu, Oct 9, 2025 at 6:26 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Hi,
Here is the patch v5; I changed it to use your proposed doc text and the link.
There is a trailing whitespace warning while applying the patch.
Regarding:
+ on the standby (with synced=false) can be overwritten during logical
Can we replace 'with synced=false' with:
with <structfield>synced</structfield> field as <literal>false</literal>
At this stage, can the patch be submitted to the current commitfest for review?
Sure. Please register the patch to CF and continue developing it further.
thanks
Shveta
Hi,
Here is the v6 of the patch that I'll put in the current CF.
Regards,
Fabrice
Attachments:
v6-0001-Add-allow_overwrite-option-to-logical-replication.patch
From 05fb2b5a254c82d49459c018ab2702730bc0d791 Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice636861@gmail.com>
Date: Wed, 15 Oct 2025 11:50:44 +0200
Subject: [PATCH v6] Add allow_overwrite option to logical replication slot
creation
This patch adds a new parameter allow_overwrite to
pg_create_logical_replication_slot. When true, an existing
failover slot on the standby will be dropped and replaced
automatically.
---
doc/src/sgml/system-views.sgml | 15 ++++++++++
src/backend/catalog/system_functions.sql | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/logical/launcher.c | 2 +-
src/backend/replication/logical/slotsync.c | 34 ++++++++++++++++++----
src/backend/replication/slot.c | 3 +-
src/backend/replication/slotfuncs.c | 13 ++++++---
src/backend/replication/walsender.c | 4 +--
src/include/catalog/pg_proc.dat | 14 ++++-----
src/include/replication/slot.h | 8 ++++-
10 files changed, 75 insertions(+), 22 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 7971498..41cbf3f 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3102,6 +3102,21 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>allow_overwrite</structfield> <type>bool</type>
+ </para>
+ <para>
+ This parameter controls whether an existing logical replication slot
+ on the standby (with <structfield>synced</structfield> field as <literal>false</literal>)
+ can be overwritten during logical
+ replication slot synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>). The default is
+ <literal>false</literal>. When <literal>true</literal>, an existing user slot with the same name on the
+ standby will be synchronized using the primary’s failover slot.
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 2d946d6..75ea84f 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -480,6 +480,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN allow_overwrite boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 823776c..9203d08 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1060,7 +1060,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.invalidation_reason,
L.failover,
- L.synced
+ L.synced,
+ L.allow_overwrite
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 218cefe..a26af52 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -1519,7 +1519,7 @@ CreateConflictDetectionSlot(void)
errmsg("creating replication conflict detection slot"));
ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false, RS_PERSISTENT, false,
- false, false);
+ false, false, false);
init_conflict_slot_xmin();
}
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 8c061d5..a8d44fe 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -649,22 +649,45 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
return false;
}
+retry:
/* Search for the named slot */
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool synced;
+ bool allow_overwrite;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ allow_overwrite = slot->data.allow_overwrite;
SpinLockRelease(&slot->mutex);
/* User-created slot with the same name exists, raise ERROR. */
if (!synced)
- ereport(ERROR,
- errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("exiting from slot synchronization because same"
- " name slot \"%s\" already exists on the standby",
- remote_slot->name));
+ {
+ /* Check if we need to overwrite an existing logical slot */
+ if (allow_overwrite)
+ {
+ ereport(LOG,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("slot \"%s\" already exists"
+ " on the standby but will be overwritten as"
+ " allow_overwrite is set to true",
+ remote_slot->name));
+
+ /* Get rid of a replication slot that is no longer wanted */
+ ReplicationSlotAcquire(remote_slot->name, true, false);
+ ReplicationSlotDropAcquired();
+ goto retry;
+ }
+ else
+ {
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("exiting from slot synchronization because same"
+ " name slot \"%s\" already exists on the standby",
+ remote_slot->name));
+ }
+ }
/*
* The slot has been synchronized before.
@@ -760,6 +783,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
+ false,
true);
/* For shorter lines. */
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index ac188bb..84a1838 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -351,7 +351,7 @@ IsSlotForConflictCheck(const char *name)
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool allow_overwrite, bool synced)
{
ReplicationSlot *slot = NULL;
int i;
@@ -445,6 +445,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.allow_overwrite = allow_overwrite;
slot->data.synced = synced;
/* and then data only present in shared memory */
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b8f2115..6a6a03e 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -41,7 +41,7 @@ create_physical_replication_slot(char *name, bool immediately_reserve,
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, false);
if (immediately_reserve)
{
@@ -116,7 +116,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, bool allow_overwrite,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -134,7 +134,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, allow_overwrite, false);
/*
* Create logical decoding context to find start point or, if we don't
@@ -173,6 +173,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ bool allow_overwrite = PG_GETARG_BOOL(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -191,6 +192,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
temporary,
two_phase,
failover,
+ allow_overwrite,
InvalidXLogRecPtr,
true);
@@ -235,7 +237,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 20
+#define PG_GET_REPLICATION_SLOTS_COLS 21
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -443,6 +445,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ values[i++] = BoolGetDatum(slot_contents.data.allow_overwrite);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
@@ -726,6 +730,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ false,
src_restart_lsn,
false);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 59822f2..40586b2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1202,7 +1202,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, false);
if (reserve_wal)
{
@@ -1233,7 +1233,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, false);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b51d2b1..8ef7619 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11511,17 +11511,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,pg_lsn,timestamptz,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,two_phase_at,inactive_since,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,pg_lsn,timestamptz,bool,text,bool,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,two_phase_at,inactive_since,conflicting,invalidation_reason,failover,synced,allow_overwrite}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool bool',
+ proallargtypes => '{name,name,bool,bool,bool,bool,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,allow_overwrite,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162..ca50b79 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -141,6 +141,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /*
+ * Allow Postgres to drop logical replication slot on standby server to ensure
+ * creation of new failover slot when remote sync_replication_slots is set to true.
+ */
+ bool allow_overwrite;
} ReplicationSlotPersistentData;
/*
@@ -301,7 +307,7 @@ extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover,
+ bool two_phase, bool failover, bool allow_overwrite,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
--
2.47.3
On Wed, Oct 15, 2025 at 3:29 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Hi,
Here is the v6 of the patch that I'll put in the current CF.
Okay, sounds good.
thanks
Shveta
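(For readers following along: a minimal usage sketch, assuming the v6 patch is applied; the slot name, plugin, and the queries themselves are illustrative, not part of the patch.)
-- Create a failover-enabled logical slot and mark it as overwritable: if this
-- node later becomes a standby, the slot sync worker may drop and recreate it
-- from the new primary instead of erroring out (v6 semantics).
SELECT pg_create_logical_replication_slot(
         'logical_slot', 'pgoutput',
         temporary := false,
         twophase := false,
         failover := true,
         allow_overwrite := true);
-- The new flag is exposed in pg_replication_slots:
SELECT slot_name, failover, synced, allow_overwrite
  FROM pg_replication_slots
 WHERE slot_type = 'logical';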
Hi all,
I don't think we really need to have allow_overwrite.
It is not possible to create logical slots on standby with failover=true,
therefore we can safely rely on failover being true to understand that at
some point this node was a primary and that this slot is supposed to be
synced.
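(As a quick illustration of that restriction, a sketch with made-up names; the exact error wording may vary by version:)
-- On a server in recovery (a standby), this is rejected, because user-created
-- failover slots are not allowed there; only the slot sync worker creates them.
SELECT pg_create_logical_replication_slot('foo', 'pgoutput', failover := true);
-- ERROR (approximate wording): cannot enable failover for a replication slot
-- created on the standby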
Please see the patch attached.
Regards,
--
Alexander Kukushkin
Attachments:
0001-Continue-slots-synchronization-after-switchover.patch
From 469324c263c5ee09ae0f6d75761e2607e08a9e67 Mon Sep 17 00:00:00 2001
From: Alexander Kukushkin <cyberdemn@gmail.com>
Date: Tue, 28 Oct 2025 13:32:43 +0100
Subject: [PATCH] Continue slots synchronization after switchover
When the former primary is started up after clean shutdown as a standby
it was refusing to synchronize logical failover slots because synced is
false with the error: "exiting from slot synchronization because same
name slot "<name>" already exists on the standby".
Since we don't allow creation of logical replication slots with
failover=true on standby it is safe to check that replication slot
has failover=true and in this case set synced=true and continue.
Besides that, we change filter in drop_local_obsolete_slots() function,
because 'failover' better defines purpose of the slot.
---
src/backend/replication/logical/slotsync.c | 31 ++++++++++++++++------
1 file changed, 23 insertions(+), 8 deletions(-)
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index b122d99b009..55d6f2fb9e9 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -455,10 +455,11 @@ drop_local_obsolete_slots(List *remote_slot_list)
* slot by the user. This new user-created slot may end up using
* the same shared memory as that of 'local_slot'. Thus check if
* local_slot is still the synced one before performing actual
- * drop.
+ * drop. Yes, we actually check 'failover', not 'synced', because
+ * it could have been created on primary which is now a standby.
*/
SpinLockAcquire(&local_slot->mutex);
- synced_slot = local_slot->in_use && local_slot->data.synced;
+ synced_slot = local_slot->in_use && local_slot->data.failover;
SpinLockRelease(&local_slot->mutex);
if (synced_slot)
@@ -653,18 +654,32 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool synced;
+ bool failover;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
+ failover = slot->data.failover;
SpinLockRelease(&slot->mutex);
- /* User-created slot with the same name exists, raise ERROR. */
if (!synced)
- ereport(ERROR,
- errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("exiting from slot synchronization because same"
- " name slot \"%s\" already exists on the standby",
- remote_slot->name));
+ {
+ /* User-created slot with the same name exists, raise ERROR. */
+ if (!failover)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("exiting from slot synchronization because same"
+ " name slot \"%s\" already exists on the standby",
+ remote_slot->name));
+
+ /*
+ * At some point we were a primary, and it was expected to have
+ * synced = false and failover = true. In this case we want to set
+ * synced = true and continue synchronization.
+ */
+ SpinLockAcquire(&slot->mutex);
+ slot->data.synced = true;
+ SpinLockRelease(&slot->mutex);
+ }
/*
* The slot has been synchronized before.
--
2.34.1
On Tue, Oct 28, 2025 at 6:33 PM Alexander Kukushkin <cyberdemn@gmail.com> wrote:
Hi all,
I don't think we really need to have allow_overwrite.
It is not possible to create logical slots on standby with failover=true, therefore we can safely rely on failover being true to understand that at some point this node was a primary and that this slot is supposed to be synced.
Please see the patch attached.
We had discussed this point in another thread, please see [1]. After
discussion it was decided to not go this way.
[1]: /messages/by-id/OS0PR01MB57161FF469DE049765DD53A89475A@OS0PR01MB5716.jpnprd01.prod.outlook.com
thanks
Shveta
Hi,
I indeed proposed a solution at the top of this thread to modify only the
value of the synced attribute, but the discussion was redirected to adding
an extra parameter to the function pg_create_logical_replication_slot() to
overwrite a failover slot.
Regards,
Fabrice
Hi,
On Fri, 31 Oct 2025 at 09:16, Fabrice Chapuis <fabrice636861@gmail.com>
wrote:
Hi,
I indeed proposed a solution at the top of this thread to modify only the
value of the synced attribute, but the discussion was redirected to adding
an extra parameter to the function pg_create_logical_replication_slot() to
overwrite a failover slot.
We had discussed this point in another thread, please see [1]. After
discussion it was decided to not go this way.
[1]: /messages/by-id/OS0PR01MB57161FF469DE049765DD53A89475A@OS0PR01MB5716.jpnprd01.prod.outlook.com
I’ve read through the referenced discussion, and my impression is that we
might be trying to design a solution around assumptions that are unlikely
to hold in practice.
There was an argument that at some point we might allow creating logical
failover slots on cascading standbys. However, if we consider all practical
scenarios, it seems very unlikely that such a feature could work reliably
with the current design.
Let me try to explain why.
Consider the following setup:
node1 - primary
node2 - standby, replicating from node1
node3 - standby, replicating from node1, has logical slot foo
(failover=true, synced=false)
node4 - standby, replicating from node3, has logical slot foo
(failover=true, synced=true)
1) If node1 fails, we could promote either node2 or node3:
1.a) If we promote node2, we must first create a physical slot for node3,
update primary_conninfo on node3 to point to node2, wait until node3
connects, and until catalog_xmin on the physical slot becomes non-NULL.
Only then would it be safe to promote node2. This introduces unnecessary
steps, complexity, and waiting — increasing downtime, which defeats the
goal of high availability.
1.b) If we promote node3, promotion itself is fast, but subscribers will
still be using the slot on the original primary. This again defeats the
purpose of doing logical replication from a standby, and it won’t be
possible to switch subscribers to node4 (see below).
2) If node3 fails, we might want to replace it with node4. But node4 has a
slot with failover=true and synced=true, and synced=true prevents it from
being used for streaming because it’s a standby.
In other words, with the current design, allowing creation of logical
failover slots on standbys doesn’t bring any real benefit — such “synced”
slots can’t actually be used later.
One could argue that we could add a function to switch synced=true->false
on a standby, but that would just add another workaround on top of an
already fragile design, increasing operational complexity without solving
the underlying issue.
The same applies to proposals like allow_overwrite. If such a flag is
introduced, in practice it will almost always be used unconditionally, e.g.:
SELECT pg_create_logical_replication_slot('<name>', '<plugin>', failover :=
true, allow_overwrite := true);
Right now, logical failover slots can’t be reused after a switchover, which
is a perfectly normal operation.
The only current workaround is to detect standbys with failover=true,
synced=false and drop those slots, hoping they’ll be resynchronized. But
resynchronization is asynchronous, unpredictable, and may take an unbounded
amount of time. If the primary fails during that window, there might be no
standby with ready logical slots.
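(For concreteness, that workaround amounts to something like the following on the standby; this is a sketch only, and deciding when it is safe to run is exactly the operational problem described above:)
-- List leftover failover slots that the slot sync worker refuses to adopt
-- after a switchover.
SELECT slot_name, failover, synced, inactive_since
  FROM pg_replication_slots
 WHERE slot_type = 'logical' AND failover AND NOT synced;
-- Drop them and hope they get re-created by slot synchronization.
SELECT pg_drop_replication_slot(slot_name)
  FROM pg_replication_slots
 WHERE slot_type = 'logical' AND failover AND NOT synced;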
Instead of dropping such slots, what we actually need is a way to safely
set synced=false->true and continue operating.
Operating logical replication setups is already extremely complex and
error-prone — this is not theoretical, it’s something many of us face daily.
So rather than adding more speculative features or workarounds, I think we
should focus on addressing real operational pain points and the
inconsistencies in the current design.
A slot created on the primary (which later becomes a standby) with
failover=true has a very clear purpose. The failover flag already indicates
that purpose; synced shouldn’t override it.
Regards,
--
Alexander Kukushkin
Thanks for your extensive analysis and explanation, Alexander. If we go in the
direction you propose and leave aside the option of a supplementary
allow_overwrite flag, may I propose some modifications to the v0 patch you
have attached:
Why modify this function?
drop_local_obsolete_slots(List *remote_slot_list)
List *local_slots = get_local_synced_slots(); => since the failover slot
we check has synced = false, it won't enter the loop and the slot won't be
dropped.
If we want to resynchronize the slot properly, why not drop it and
recreate it cleanly instead of setting the synced flag to true even though the
slot is not actually synchronized? Here is the code I wrote in patch
version 6, with the check on the failover flag.
retry:
/* Search for the named slot */
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool synced;
bool failover;
SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
failover = slot->data.failover;
SpinLockRelease(&slot->mutex);
if (!synced)
{
/* User-created slot with the same name exists, raise ERROR. */
if (!failover)
ereport(ERROR,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("exiting from slot synchronization because same"
" name slot \"%s\" already exists on the standby",
remote_slot->name));
/*
* At some point we were a primary, and it was expected to have
* synced = false and failover = true. To resynchronize the slot we could
* drop it and replay the code for the slot to be recreated cleanly.
*/
ereport(LOG,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("slot \"%s\" already exists"
" on the standby but will be dropped and recreated to be resynchronized",
remote_slot->name));
/* Get rid of a replication slot that is no longer wanted */
ReplicationSlotAcquire(remote_slot->name, true, false);
ReplicationSlotDropAcquired();
goto retry;
}
Thanks for your appreciation and feedback.
Regards,
Fabrice
Hi Fabrice,
On Fri, 31 Oct 2025 at 17:45, Fabrice Chapuis <fabrice636861@gmail.com>
wrote:
Thanks for your extensive analysis and explanation, Alexander. If we go in the
direction you propose and leave aside the option of a supplementary
allow_overwrite flag, may I propose some modifications to the v0 patch you
have attached:
Why modify this function?
drop_local_obsolete_slots(List *remote_slot_list)
Think about the following case:
1. We have node1 (primary) and node2 (standby), with slot foo(failover=true)
2. We stop node1 for maintenance and promote node2
3. DROP replication slot on node2
4. Start node1
In this scenario the logical slot "foo" will be left unattended and it
needs to be dropped, because it doesn't exist on the primary.
As I said earlier, IMO it is only the "failover" flag that effectively
indicates the purpose of the slot. The sync flag is purely informative.
Regards,
--
Alexander Kukushkin
Hi Alexander,
I understand your use case, but I think it won't work like this because of
how get_local_synced_slots() is implemented:
this function returns only slots whose synced flag is true.
At point 4 (start node1), the synced flag is false and the failover flag is
true, so the slot won't enter the loop and will not be dropped.
static void
drop_local_obsolete_slots(List *remote_slot_list)
{
List *local_slots = get_local_synced_slots();
foreach_ptr(ReplicationSlot, local_slot, local_slots)
{
/* Drop the local slot if it is not required to be retained. */
if (!local_sync_slot_required(local_slot, remote_slot_list)) => checks whether
the remote slot exists and whether the local slot is not invalidated
{
bool synced_slot;
...
We could change the function get_local_synced_slots to make it work
get_local_synced_slots(void)
{
List *local_slots = NIL;
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
{
ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
/* Check if it is a synchronized slot */
if (s->in_use && (s->data.failover || s->data.synced))
{
Assert(SlotIsLogical(s));
local_slots = lappend(local_slots, s);
Thanks for your feedback
Regards,
Fabrice
....
On Tue, 4 Nov 2025 at 12:15, Fabrice Chapuis <fabrice636861@gmail.com>
wrote:
Hi Alexander,
I understand your use case but I think it won't work like this because of
how is implemented get_local_synced_slots()
this function will return slots which synced flag to true only.
Yeah. Sorry, my bad, you are absolutely right
about get_local_synced_slots() function.
I didn't validate this part of the patch, because first I wanted to get
feedback from hackers.
However, it looks like the people who build it never run Postgres in
production and therefore don't understand the problem and the pain.
I will probably just add yet another workaround to Patroni and start
dropping logical replication slots on standby with failover=true and
synced=false.
--
Regards,
--
Alexander Kukushkin
I propose to modify
get_local_synced_slots() => to return the local logical slots (the function
should then be renamed)
and
local_sync_slot_required() => to also take into account the slots that have
the flag synced = false.
Then, in both our use cases, the local slot will be dropped and recreated if
it must be resynchronized.
Changes to the synchronize_one_slot() function are no longer necessary.
static List *
get_local_synced_slots(void)
{
List *local_slots = NIL;
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
{
ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
/* Check if it is a synchronized slot */
if (s->in_use && (s->data.synced || s->data.failover))
{
Assert(SlotIsLogical(s));
local_slots = lappend(local_slots, s);
}
}
LWLockRelease(ReplicationSlotControlLock);
return local_slots;
}
static bool
local_sync_slot_required(ReplicationSlot *local_slot, List *remote_slots)
{
bool remote_exists = false;
bool locally_invalidated = false;
bool synced_slot = false;
foreach_ptr(RemoteSlot, remote_slot, remote_slots)
{
if (strcmp(remote_slot->name, NameStr(local_slot->data.name)) == 0)
{
remote_exists = true;
/*
* If remote slot is not invalidated but local slot is marked as
* invalidated, then set locally_invalidated flag.
*/
SpinLockAcquire(&local_slot->mutex);
locally_invalidated =
(remote_slot->invalidated == RS_INVAL_NONE) &&
(local_slot->data.invalidated != RS_INVAL_NONE);
synced_slot = local_slot->data.synced;
SpinLockRelease(&local_slot->mutex);
break;
}
}
return (remote_exists && !locally_invalidated && synced_slot);
}
I could propose a new patch along these lines.
Regards,
Fabrice
Hi,
Here is the new version of the patch
Regards,
Fabrice
Attachments:
v1-0001-Resync-logical-replication-slot-after-switchover.patch
From 07beefc31a07cec4fcc81687be49c563fa359f84 Mon Sep 17 00:00:00 2001
From: Fabrice Chapuis <fabrice636861@gmail.com>
Date: Fri, 7 Nov 2025 18:04:15 +0100
Subject: [PATCH v1] Resync logical replication slot after switchover
When the former primary is started up after clean shutdown,
it was refusing to synchronize the logical failover slots with the new primary
because synced flag is false.
The following error message is produced: "exiting from slot synchronization because same
slot name already exists on the standby".
If the local slot has failover=true, it can be overwritten to be synced with
the primary.
This patch mainly changes the drop_local_obsolete_slots() function and
checks whether a local failover slot can be dropped.
---
src/backend/replication/logical/slotsync.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 8c061d5..16e20d4 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -341,7 +341,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid,
}
/*
- * Get the list of local logical slots that are synchronized from the
+ * Get the list of local logical slots that are synchronized or not with the
* primary server.
*/
static List *
@@ -356,7 +356,7 @@ get_local_synced_slots(void)
ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
/* Check if it is a synchronized slot */
- if (s->in_use && s->data.synced)
+ if (s->in_use && (s->data.synced || s->data.failover))
{
Assert(SlotIsLogical(s));
local_slots = lappend(local_slots, s);
@@ -380,6 +380,7 @@ local_sync_slot_required(ReplicationSlot *local_slot, List *remote_slots)
{
bool remote_exists = false;
bool locally_invalidated = false;
+ bool synced_slot = false;
foreach_ptr(RemoteSlot, remote_slot, remote_slots)
{
@@ -395,13 +396,14 @@ local_sync_slot_required(ReplicationSlot *local_slot, List *remote_slots)
locally_invalidated =
(remote_slot->invalidated == RS_INVAL_NONE) &&
(local_slot->data.invalidated != RS_INVAL_NONE);
+ synced_slot = local_slot->data.synced;
SpinLockRelease(&local_slot->mutex);
break;
}
}
- return (remote_exists && !locally_invalidated);
+ return (remote_exists && !locally_invalidated && synced_slot);
}
/*
@@ -455,10 +457,12 @@ drop_local_obsolete_slots(List *remote_slot_list)
* slot by the user. This new user-created slot may end up using
* the same shared memory as that of 'local_slot'. Thus check if
* local_slot is still the synced one before performing actual
- * drop.
+ * drop. Yes, we actually check 'failover', not 'synced', because
+ * it could have been created on primary which is now a standby.
*/
+
SpinLockAcquire(&local_slot->mutex);
- synced_slot = local_slot->in_use && local_slot->data.synced;
+ synced_slot = local_slot->in_use && (local_slot->data.failover || local_slot->data.synced);
SpinLockRelease(&local_slot->mutex);
if (synced_slot)
--
2.47.3
On Fri, Oct 31, 2025 at 2:58 PM Alexander Kukushkin <cyberdemn@gmail.com> wrote:
Instead of dropping such slots, what we actually need is a way to safely set synced=false->true and continue operating.
Operating logical replication setups is already extremely complex and error-prone — this is not theoretical, it’s something many of us face daily.
So rather than adding more speculative features or workarounds, I think we should focus on addressing real operational pain points and the inconsistencies in the current design.
A slot created on the primary (which later becomes a standby) with failover=true has a very clear purpose. The failover flag already indicates that purpose; synced shouldn't override it.
I think this is not as clear as you are saying, compared to WAL. In
failover cases, we bump the WAL timeline on the new primary and also have
facilities like pg_rewind to ensure that the old primary can follow the
new primary after divergence. For slots, there is no such facility.
Now, there is an argument that for slots it is sufficient to match
the name and failover to say that it is okay to overwrite the slot on
the old primary. However, it is not clear whether it is always safe to do
so; for example, if the old primary ran after divergence for some time
and one has re-created the slot with the same name and failover property,
it will no longer be the same slot. Unlike WAL, we don't maintain the
slot's history, so it is not equally clear that we can overwrite the old
primary's slots as they are.
--
With Regards,
Amit Kapila.
Hi Amit,
If I summarize your scenario:
1. A standby S has a failover slot slot1 synchronized with slot1 on primary P
2. We promote S
3. On P we drop slot1 and create slot1 again with failover mode (for example,
a subscriber exists on another instance)
4. A rewind is performed on P, the former primary, to rejoin S, the former
standby
5. On P, slot1 is automatically dropped and recreated to be synchronized
In which context could this kind of scenario happen?
Isn't the goal to find a solution for a switchover carried out for
maintenance on a Postgres cluster? The aim is to find a compromise that covers
the most likely scenarios.
Do you think we should come back to the allow_overwrite flag approach, or is
there another solution?
Best Regards,
Fabrice
On Tue, Nov 11, 2025 at 9:27 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
If I summarize your scenario:
1. A standby S has a failover slot slot1 synchronized with slot1 on primary P
2. We promote S
3. On P we drop slot1 and create slot1 again with failover mode (for example, a subscriber exists on another instance)
4. A rewind is performed on P, the former primary, to rejoin S, the former standby
5. On P, slot1 is automatically dropped and recreated to be synchronized
In which context could this kind of scenario happen?
It is difficult to tell when this can happen but you detailed there is
a theoretical possibility of the same. If we had an in-core cluster
tool that manages nodes on its own which doesn't allow such scenarios
to happen then we could possibly say that using such a tool it is safe
to overwrite old primary's slots.
Isn't the goal to find a solution for a switchover which is carried out for maintenance on a Postgres cluster, the aim is to find a compromise to cover the most likely scenarios.
Yes, that is why I thought of providing some form of UI that can allow
outside cluster solutions to manage such slots.
Do you think we must come back to the allow_overwrite flag approach or another solution?
We can wait for a few days to see what others think on this matter.
--
With Regards,
Amit Kapila.
Hi Amit,
On Wed, 12 Nov 2025 at 05:22, Amit Kapila <amit.kapila16@gmail.com> wrote:
It is difficult to tell when this can happen but you detailed there is
a theoretical possibility of the same. If we had an in-core cluster
tool that manages nodes on its own which doesn't allow such scenarios
to happen then we could possibly say that using such a tool it is safe
to overwrite old primary's slots.
That's a lot of ifs, and none of them could be fulfilled in the foreseeable
future.
The situation you describe is impossible.
When there is a split-brain and someone drops and re-creates logical slots
with the same names on the old primary, such a node can't be joined as a
standby without pg_rewind.
In its current state pg_rewind wipes the pg_replslot directory, and
therefore there will be no replication slots.
That is, if there is a logical replication slot with failover=true and
synced=false on a healthy standby, it could have happened only because the
old primary was shut down gracefully.
Regards,
--
Alexander Kukushkin
On Wed, Nov 12, 2025 at 12:58 PM Alexander Kukushkin
<cyberdemn@gmail.com> wrote:
On Wed, 12 Nov 2025 at 05:22, Amit Kapila <amit.kapila16@gmail.com> wrote:
It is difficult to tell when this can happen but you detailed there is
a theoretical possibility of the same. If we had an in-core cluster
tool that manages nodes on its own which doesn't allow such scenarios
to happen then we could possibly say that using such a tool it is safe
to overwrite old primary's slots.That's a lot of ifs, and none of them could be fulfilled in the foreseeable future.
Situation you describe is impossible.
When there is a split-brain and someone drops and re-creating logical slots with the same names on the old primary - such node can't be joined as a standby without pg_rewind.
In its current state pg_rewind wipes the pg_replslot directory, and therefore there will be no replication slots.
Say the only operations that happened are slot-drop-recreate and/or
some operations on unlogged tables. Why then is pg_rewind required?
--
With Regards,
Amit Kapila.
Hi Amit,
On Thu, 13 Nov 2025 at 09:42, Amit Kapila <amit.kapila16@gmail.com> wrote:
Say, the only operations that happened are slot-drop-recreate and or
some operations on unlogged tables. Why then pg_rewind is required?
A clean shutdown always writes a checkpoint record to WAL and updates
pg_control.
Regards,
--
Alexander Kukushkin
On Thu, Nov 13, 2025 at 2:37 PM Alexander Kukushkin <cyberdemn@gmail.com> wrote:
Hi Amit,
On Thu, 13 Nov 2025 at 09:42, Amit Kapila <amit.kapila16@gmail.com> wrote:
Say, the only operations that happened are slot-drop-recreate and or
some operations on unlogged tables. Why then pg_rewind is required?
Clean shut down always writes checkpoint record to WAL and updates pg_control.
But the system can die/crash before shutdown.
--
With Regards,
Amit Kapila.
But the system can die/crash before shutdown.
You mean it will not write WAL?
When a logical replication slot is created we build a snapshot and also
write to WAL:
postgres=# select pg_current_wal_insert_lsn(); select
pg_create_logical_replication_slot('foo', 'pgoutput'); select
pg_current_wal_insert_lsn();
pg_current_wal_insert_lsn
---------------------------
0/37F96F8
(1 row)
pg_create_logical_replication_slot
------------------------------------
(foo,0/37F9730)
(1 row)
pg_current_wal_insert_lsn
---------------------------
0/37F9730
(1 row)
Only after that slot is marked as persistent.
Regards,
--
Alexander Kukushkin
On Thu, Nov 13, 2025 at 6:39 PM Alexander Kukushkin <cyberdemn@gmail.com> wrote:
But the system can die/crash before shutdown.
You mean it will not write WAL?
When a logical replication slot is created we build a snapshot and also write to WAL:
postgres=# select pg_current_wal_insert_lsn(); select pg_create_logical_replication_slot('foo', 'pgoutput'); select pg_current_wal_insert_lsn();
pg_current_wal_insert_lsn
---------------------------
0/37F96F8
(1 row)
pg_create_logical_replication_slot
------------------------------------
(foo,0/37F9730)
(1 row)
pg_current_wal_insert_lsn
---------------------------
0/37F9730
(1 row)
Only after that slot is marked as persistent.
There can be a scenario where a replication slot is dropped and
recreated, and its WAL is also replicated to the standby. However,
before the new slot state can be synchronized via slotsync, the
primary crashes and the standby is promoted. Later, the user manually
reconfigures the old primary to follow the newly promoted standby (no
pg_rewind in play). I was wondering whether, in such a case, it would
be a good idea to overwrite the newly created slot on the old primary with
the promoted standby's synced slot (the old one) by default? Thoughts?
thanks
Shveta
On Thu, Nov 13, 2025 at 7:16 PM shveta malik <shveta.malik@gmail.com> wrote:
There can be a scenario where a replication slot is dropped and
recreated, and its WAL is also replicated to the standby. However,
before the new slot state can be synchronized via slotsync, the
primary crashes and the standby is promoted. Later, the user manually
reconfigures the old primary to follow the newly promoted standby (no
pg-rewind in play). I was wondering whether in such a case, would it
be a good idea to overwrite the newly created slot on old primary with
promoted-standby's synced slot (old one) by default? Thoughts?
I think it's an extremely rare or a mostly wrong operation that after a
failover (i.e., the old primary didn't shut down gracefully) users have
the old primary rejoin the replication as the new standby without
pg_rewind. I guess that pg_rewind should practically be used unless
the primary server shuts down gracefully (i.e., in the switchover case). In
failover cases, pg_rewind launches the server in single-user mode to
run the crash recovery, advancing its LSN and cleaning all existing
replication slots after rewinding the server. So I think that the
reported issue doesn't happen in failover cases and we can focus on
failover cases.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi Masahiko,
On Fri, 14 Nov 2025 at 07:10, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
So I think that the
reported issue doesn't happen in failover cases and we can focus on
failover cases.
I think you wanted to say "focus on *switchover* cases"?
Regards,
--
Alexander Kukushkin
On Fri, Nov 14, 2025 at 12:42 AM Alexander Kukushkin
<cyberdemn@gmail.com> wrote:
Hi Masahiko,
On Fri, 14 Nov 2025 at 07:10, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
So I think that the
reported issue doesn't happen in failover cases and we can focus on
failover cases.
I think you wanted to say "focus on *switchover* cases"?
Oops, yes, "we can focus on switchover cases" is what I wanted to say.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, Nov 14, 2025 at 11:40 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I think it's an extremely rare or a mostly wrong operation that after
failover (i.e., the old primary didn't shutdown gracefully) users have
the old primar rejoin to the replication as the new standby without
pg_rewind. I guess that pg_rewind should practically be used unless
the primary server gracefully shutdowns (i.e., in switchover case). In
failover cases, pg_rewind launches the server in single-user mode to
run the crash recovery, advancing its LSN and cleaning all existing
replication slots after rewinding the server. So I think that the
reported issue doesn't happen in failover cases and we can focus on
failover cases.
The point is quite fundamental, do you think we can sync to a
pre-existing slot with the same name and failover marked as true after
the first time the node joins a new primary? We don't provide any
switchover tools/utilities, so it doesn't appear straightforward that
we can perform re-sync. If we had a switchover tool, I think
one may have removed all existing slots before the old primary joins
the new primary because otherwise, there is always a chance that there
remain redundant slots which will prevent resource removal. Consider a
case where after switchover, the old primary decides to join a
different standby (new primary) than where slot-sync was earlier
happening. Now, it is possible that the old primary may have some
slots which should be removed.
--
With Regards,
Amit Kapila.
On Fri, Nov 14, 2025 at 2:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
The point is quite fundamental, do you think we can sync to a
pre-existing slot with the same name and failover marked as true after
the first time the node joins a new primary?
Given the current behavior that we cannot create a logical slot with
failover=true on the standby, it makes sense to me that we overwrite
the pre-existing slot (with synced=false and failover=true) on the old
primary by the slot (with synced=true and failover=true) on the new
primary if their names, plugins, and other properties match and the
pre-existing slot has older LSNs and XIDs than the one on the new
primary. But at the same time, we need to consider the possible future
changes that allow users to create a slot with failover=true also on
the standby.
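To be concrete, the kind of check I have in mind would look roughly like
the sketch below (untested; slot_overwrite_is_safe() is a hypothetical
helper that does not exist in slotsync.c, and the field names are assumed
from the existing RemoteSlot and ReplicationSlot structures there):

    /*
     * Hypothetical helper: decide whether a pre-existing local slot
     * (failover=true, synced=false) may be overwritten by the remote
     * failover slot during synchronization.
     */
    static bool
    slot_overwrite_is_safe(RemoteSlot *remote_slot, Oid remote_dbid,
                           ReplicationSlot *slot)
    {
        /* The properties must match. */
        if (remote_dbid != slot->data.database ||
            remote_slot->two_phase != slot->data.two_phase ||
            !remote_slot->failover || !slot->data.failover ||
            strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) != 0)
            return false;

        /* The local slot must not be ahead of the slot on the new primary. */
        if (slot->data.restart_lsn > remote_slot->restart_lsn ||
            slot->data.confirmed_flush > remote_slot->confirmed_lsn)
            return false;

        if (TransactionIdIsValid(slot->data.catalog_xmin) &&
            TransactionIdFollows(slot->data.catalog_xmin,
                                 remote_slot->catalog_xmin))
            return false;

        return true;
    }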
Alexander pointed out[1] that allowing to create a slot with
failover=true on the standby won't work with the current
implementation. I agree with his analysis, and I guess we would need
more changes than simply allowing it, regardless of accepting the
proposed change. We might need to introduce a replication slot origin
or a generation.
We don't provide any
switchover tools/utilities, so it doesn't appear straightforward that
we can perform re-sync. If we had a switchover tool, I think
one may have removed all existing slots before the old primary joins
the new primary because otherwise, there is always a chance that there
remain redundant slots which will prevent resource removal. Consider a
case where after switchover, the old primary decides to join a
different standby (new primary) than where slot-sync was earlier
happening. Now, it is possible that the old primary may have some
slots which should be removed.
Right, there are cases where users need to remove existing slots
before allowing the old primary to join the new primary. While
removing all slots would be the simplest solution that works in all
cases, IIUC the proposal here is to reuse pre-existing slots in
scenarios involving a switchover. The idea itself seems reasonable and
would help reduce the time spent in degraded operation mode (other
benefits too?).
Regards,
[1]: /messages/by-id/CAFh8B=kH4Nwd_69fWP4VxK9tjxxBUzdxEZLd+LCby9tbSMTcRA@mail.gmail.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi Masahiko,
On Fri, 14 Nov 2025 at 23:32, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Given the current behavior that we cannot create a logical slot with
failover=true on the standby, it makes sense to me that we overwrite
the pre-existing slot (with synced=false and failover=true) on the old
primary by the slot (with synced=true and failover=true) on the new
primary if their names, plugins, and other properties match and the
pre-existing slot has older LSNs and XIDs than the one on the new
primary.
On one side, the idea of having additional checks looks reasonable, but if I
look at the existing update_local_synced_slot() function, I find the following:
    if (remote_dbid != slot->data.database ||
        remote_slot->two_phase != slot->data.two_phase ||
        remote_slot->failover != slot->data.failover ||
        strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) != 0 ||
        remote_slot->two_phase_at != slot->data.two_phase_at)
    {
        NameData    plugin_name;

        /* Avoid expensive operations while holding a spinlock. */
        namestrcpy(&plugin_name, remote_slot->plugin);

        SpinLockAcquire(&slot->mutex);
        slot->data.plugin = plugin_name;
        slot->data.database = remote_dbid;
        slot->data.two_phase = remote_slot->two_phase;
        slot->data.two_phase_at = remote_slot->two_phase_at;
        slot->data.failover = remote_slot->failover;
        SpinLockRelease(&slot->mutex);
That is, if some synced slot properties on the standby don't match the
primary's, we simply overwrite them.
I guess this is necessary because synchronization happens only
periodically, and between two runs a slot on the primary might have been
recreated with different properties.
Do we really need to have additional checks to flip a synced flag?
Regards,
--
Alexander Kukushkin
On Sat, Nov 15, 2025 at 4:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Alexander pointed out[1] that allowing to create a slot with
failover=true on the standby won't work with the current
implementation. I agree with his analysis, and I guess we would need
more changes than simply allowing it, regardless of accepting the
proposed change. We might need to introduce a replication slot origin
or a generation.
AFAICS, the email you pointed out wrote about use cases, not the
actual code implementation. We can discuss use cases if we want to
pursue that implementation, but the reason why we decided not to allow
it was for the cases where users try to configure cascaded standbys to
also try to sync slots from the first standby that are already being
synced from the primary. There are quite a few technical challenges in
supporting that, like how to make sure primary waits even for cascaded
standbys before sending the changes to logical subscribers.
OTOH, for the cases where there is a totally different logical slot on
standby (not present on primary) with failover=true, we can allow it
to be synced from standby-1 to a cascaded standby, though we need some
way to distinguish those cases. For example, during sync on cascaded
standby, we can ensure that the slot being synced is not a sync-slot
(failover=true and sync=true).
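As a rough sketch, something along these lines (assumption: the slot-sync
query would also have to report whether the upstream slot is itself a synced
slot; RemoteSlot has no such field today, so remote_synced below is
illustrative only):

    /*
     * On a cascaded standby, only slots created directly on the upstream
     * standby (failover=true, synced=false there) would be candidates for
     * synchronization; slots the upstream itself synced from the primary
     * would be skipped.
     */
    static bool
    slot_is_syncable_from_standby(RemoteSlot *remote_slot, bool remote_synced)
    {
        return remote_slot->failover && !remote_synced;
    }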
--
With Regards,
Amit Kapila.
On Tue, Nov 18, 2025 at 1:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICS, the email you pointed out wrote about use cases, not the
actual code implementation. We can discuss use cases if we want to
pursue that implementation, but the reason why we decided not to allow
it was for the cases where users try to configure cascaded standbys to
also try to sync slots from the first standby that are already being
synced from the primary. There are quite a few technical challenges in
supporting that, like how to make sure primary waits even for cascaded
standbys before sending the changes to logical subscribers.
Right. My point is that these are two independent issues. The fact
that creating a slot with failover=true directly on a standby is
difficult (due to the cascaded-standby cases you mentioned) does not,
by itself, justify allowing us to overwrite an existing slot with
failover=true and synced=false during slot synchronization.
OTOH, for the cases where there is a totally different logical slot on
standby (not present on primary) with failover=true, we can allow it
to be synced from standby-1 to a cascaded standby, though we need some
way to distinguish those cases. For example, during sync on cascaded
standby, we can ensure that the slot being synced is not a sync-slot
(failover=true and sync=true).
Yes. We need some way to distinguish those slots, otherwise if users
create a slot with the same name on the primary, the slot on standby-1
(a cascading standby) could be overwritten. I think we would need some
additional metadata per slot to support that safely.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Nov 19, 2025 at 6:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Yes. We need some way to distinguish those slots, otherwise if users
create a slot with the same name on the primary, the slot on standby-1
(a cascading standby) could be overwritten. I think we would need some
additional metadata per slot to support that safely.
It shouldn't be overwritten as of current HEAD because we give ERROR
if such a slot exists on standby but if we allow overwriting it as
being discussed here then we would require additional metadata per
slot to distinguish this case.
--
With Regards,
Amit Kapila.
The use of the *synced* flag could perhaps be a first step towards
distinguishing between a synchronization slot on a standby and a failover
slot created on a standby.
On the primary, *synced* should always be false.
On a standby, *synced* actually remains true even if the
sync_replication_slots parameter is set to 'off'.
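For example, the flags can be checked on each node with:

    select slot_name, failover, synced, inactive_since
    from pg_replication_slots
    where slot_type = 'logical';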
Regards,
Fabrice
On Tue, Nov 18, 2025 at 10:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
It shouldn't be overwritten as of current HEAD because we give ERROR
if such a slot exists on standby but if we allow overwriting it as
being discussed here then we would require additional metadata per
slot to distinguish this case.
I think we need to clarify the following: suppose the standby has a slot with
failover=true and synced=false, and the primary has a slot with the
same name, failover=true, and synced=true; what problem could occur if
we allow the slot on the standby to be overwritten by the one on the
primary? I see there are some cases where the old primary needs to
delete pre-existing slots before joining the new primary, but should
we make such an operation mandatory in switchover cases? It might be
worth considering whether such additional metadata per slot is
required before we allow the slot to be overwritten.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
I think we need to clarify the following: suppose the standby has a slot with
failover=true and synced=false, and the primary has a slot with the
same name, failover=true, and synced=true...
I'm not sure I understand the semantics of the `synced` flag, but why can
the `synced` flag be true on a primary instance? AFAICS, if `synced=true` it
means that the slot is inactive and synchronized with a slot on a remote
instance. On a primary, what is the meaning of having the synced flag set to
true?
There's already an open thread dealing with this issue [1].
The problem I see is being able to distinguish between 2 situations:
1) A failover slot has been created on a standby (failover=true and
synced=false) in a context of cascading standby. In this case the slot must
not be deleted.
2) A former primary has a slot (failover=true and synced=false) that must
be resynchronized and that can be overwritten.
Why not use a slot's metadata (allow_overwrite) to treat these two
situations separately?
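In synchronize_one_slot() terms, the distinction could roughly look like
this (only a sketch of the branching; allow_overwrite is the hypothetical
per-slot attribute proposed above and does not exist in
ReplicationSlotPersistentData today):

    if (!synced)
    {
        if (slot->data.failover && slot->data.allow_overwrite)
        {
            /*
             * Situation 2: leftover failover slot on a former primary.
             * Take the slot over and resynchronize it from the new
             * primary, flipping its synced flag.
             */
        }
        else
        {
            /*
             * Situation 1: failover slot created locally by a user (e.g.
             * for a cascading standby).  Keep refusing to synchronize
             * over it, as today.
             */
        }
    }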
Regards,
Fabrice
[1]: Clear logical slot's 'synced' flag on promotion of standby
</messages/by-id/CAJpy0uBJqTQdq+UDW55oK2bhxMig=FaEYFV=+ZaKmtmgNWwsHw@mail.gmail.com>
On Thu, Nov 20, 2025 at 6:26 AM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
I think we need to clarify the following: suppose the standby has a slot with
failover=true and synced=false, and the primary has a slot with the
same name, failover=true, and synced=true...
I'm not sure I understand the semantics of the `synced` flag, but why can the `synced` flag be true on a primary instance? AFAICS, if `synced=true` it means that the slot is inactive and synchronized with a slot on a remote instance. On a primary, what is the meaning of having the synced flag set to true?
I think that the synced flag can be true on the primary if the slot was
previously synced and the instance is now working as the primary. But
the synced flag being true doesn't mean anything on the primary. It
works only on the standby.
There's already an open thread dealing with this issue [1].
The problem I see is being able to distinguish between 2 situations:
1) A failover slot has been created on a standby (failover=true and synced=false) in a context of cascading standby. In this case the slot must not be deleted.
2) A former primary has a slot (failover=true and synced=false) that must be resynchronized and that can be overwritten.
Right.
Why not use a slot's metadata (allow_overwrite) to treat these two situations separately?
I'm not sure that the allow_overwrite idea is the best approach. For
example, suppose that in a cascading replication setup (node1 ->
node2 -> node3) we create a failover slot on node2 (failover=true,
synced=false, and allow_overwrite=false), and the slot is synchronized to
node3 (failover=true, synced=true, allow_overwrite=false). If we
do a switchover between node2 and node3, node3 joins the primary,
node1, and node2 now joins node3 as a cascaded standby (i.e., the
replication setup is now node1 -> node3 -> node2). I guess that in
this case the slot on node2 should be overwritten by the one on
node3, but it's not allowed because the slot on node2 has
allow_overwrite=false.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, Nov 21, 2025 at 2:05 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I'm not sure that the allow_overwrite idea is the best approach. For
example, suppose that in a cascading replication setup (node1 ->
node2 -> node3) we create a failover slot on node2 (failover=true,
synced=false, and allow_overwrite=false), and the slot is synchronized to
node3 (failover=true, synced=true, allow_overwrite=false). If we
do a switchover between node2 and node3, node3 joins the primary,
node1, and node2 now joins node3 as a cascaded standby (i.e., the
replication setup is now node1 -> node3 -> node2). I guess that in
this case the slot on node2 should be overwritten by the one on
node3, but it's not allowed because the slot on node2 has
allow_overwrite=false.
IIRC, we discussed the pg_alter_replication_slot() API which will
allow us to alter this new property (allow_overwrite or whatever we
name it) of slots and some existing properties like two_phase, etc.
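For example (purely hypothetical, since no such SQL-callable function
exists in core today), an operator could then run something like

    select pg_alter_replication_slot('logical_slot', allow_overwrite => true);

on the old primary before it rejoins the new primary.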
--
With Regards,
Amit Kapila.
Hi Masahiko,
I agree with your analysis. But why not take into account the parameter
synchronized_standby_slots (it guarantees that the listed standby has
received the WAL before decoded changes are sent to subscribers)?
I take your setup:
node1 => node2 => node3
node1 is the primary; node2 and node3 are standbys.
slot1 is a failover slot created on node2.
node2: synchronized_standby_slots=node3, primary_conninfo=node1, slot1: failover=true
node3: synchronized_standby_slots=node2, primary_conninfo=node2, slot1: failover=true
Case 1) Switchover between node2 and node3
node1 => node3 => node2
node2: synchronized_standby_slots=node3, primary_conninfo=node3, slot1: failover=true
=> the slot could be overwritten: node2 now connects to node3, which is the
same node that is listed in synchronized_standby_slots.
Case 2) Restart of node2
node1 => node2 => node3
node2: synchronized_standby_slots=node3, primary_conninfo=node1, slot1: failover=true
=> the slot could not be overwritten because primary_conninfo does not match
synchronized_standby_slots.
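In postgresql.conf terms, node2 would roughly look like this (host names are
illustrative; synchronized_standby_slots actually takes physical slot names,
so 'node3' stands for the slot used by node3, and the overwrite decision
itself is the rule proposed above, not existing behavior):

    # Case 1: node2 now follows node3, the same node listed in
    # synchronized_standby_slots => slot1 could be overwritten
    primary_conninfo = 'host=node3 ...'
    synchronized_standby_slots = 'node3'

    # Case 2: node2 still follows node1 while synchronized_standby_slots
    # names node3 => slot1 could not be overwritten
    primary_conninfo = 'host=node1 ...'
    synchronized_standby_slots = 'node3'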
Regards,
Fabrice