failover logical replication slots

Started by Fabrice Chapuis · 7 months ago · 16 messages
#1Fabrice Chapuis
fabrice636861@gmail.com

I'm working with logical replication in a PostgreSQL 17 setup, and I'm
exploring the new option to make replication slots failover-safe in a
highly available environment using physical standby nodes managed by
Patroni.

After a switchover, I encounter an error message in the PostgreSQL logs and
observe unexpected behavior.
Here are the different steps I followed:

1) Setting up a new subscription

Logical replication is established between two databases on the same
PostgreSQL instance.

A logical replication slot is created on the source database:

SELECT pg_create_logical_replication_slot('sub_test', 'pgoutput', false, false, true);

A subscription is then configured on the target database:

CREATE SUBSCRIPTION sub_test
    CONNECTION 'dbname=test host=localhost port=5432 user=user_test'
    PUBLICATION pub_test
    WITH (create_slot=false, copy_data=false, failover=true);

The logical replication slot is active and in failover mode.

\dRs+
List of subscriptions
+-[ RECORD 1 ]-------+-----------------------------------------------------+
| Name               | sub_test                                            |
| Owner              | postgres                                            |
| Enabled            | t                                                   |
| Publication        | {pub_test}                                          |
| Binary             | f                                                   |
| Streaming          | off                                                 |
| Two-phase commit   | d                                                   |
| Disable on error   | f                                                   |
| Origin             | any                                                 |
| Password required  | t                                                   |
| Run as owner?      | f                                                   |
| Failover           | t                                                   |
| Synchronous commit | off                                                 |
| Conninfo           | dbname=test host=localhost port=5432 user=user_test |
| Skip LSN           | 0/0                                                 |
+--------------------+-----------------------------------------------------+

select * from pg_replication_slots where slot_type = 'logical';
+-[ RECORD 1 ]--------+------------+
| slot_name           | sub_test   |
| plugin              | pgoutput   |
| slot_type           | logical    |
| datoid              | 58458      |
| database            | test       |
| temporary           | f          |
| active              | t          |
| active_pid          | 739313     |
| xmin                |            |
| catalog_xmin        | 1976743    |
| restart_lsn         | 8/5F000028 |
| confirmed_flush_lsn | 8/5F000060 |
| wal_status          | reserved   |
| safe_wal_size       |            |
| two_phase           | f          |
| inactive_since      |            |
| conflicting         | f          |
| invalidation_reason |            |
| failover            | t          |
| synced              | f          |
+---------------------+------------+

2) Starting the physical standby

A logical replication slot is successfully created on the standby:

select * from pg_replication_slots where slot_type = 'logical';
+-[ RECORD 1 ]--------+-------------------------------+
| slot_name           | sub_test                      |
| plugin              | pgoutput                      |
| slot_type           | logical                       |
| datoid              | 58458                         |
| database            | test                          |
| temporary           | f                             |
| active              | f                             |
| active_pid          |                               |
| xmin                |                               |
| catalog_xmin        | 1976743                       |
| restart_lsn         | 8/5F000028                    |
| confirmed_flush_lsn | 8/5F000060                    |
| wal_status          | reserved                      |
| safe_wal_size       |                               |
| two_phase           | f                             |
| inactive_since      | 2025-06-10 16:30:38.633723+02 |
| conflicting         | f                             |
| invalidation_reason |                               |
| failover            | t                             |
| synced              | t                             |
+---------------------+-------------------------------+

3) Cluster switchover

The switchover is initiated using the Patroni command:

patronictl switchover

The operation completes successfully, and roles are reversed in the cluster.

4) Issue encountered
After the switchover, an error appears in the PostgreSQL logs:

2025-06-10 16:40:58.996 CEST [739829]: [1-1] user=,db=,client=,application=
LOG: slot sync worker started
2025-06-10 16:40:59.011 CEST [739829]: [2-1] user=,db=,client=,application=
ERROR: exiting from slot synchronization because same name slot "sub_test"
already exists on the standby

The slot on the new standby is not marked as synced:

select * from pg_replication_slots where slot_type = 'logical';

+-[ RECORD 1 ]--------+-------------------------------+
| slot_name           | sub_test                      |
| plugin              | pgoutput                      |
| slot_type           | logical                       |
| datoid              | 58458                         |
| database            | test                          |
| temporary           | f                             |
| active              | f                             |
| active_pid          |                               |
| xmin                |                               |
| catalog_xmin        | 1976743                       |
| restart_lsn         | 8/5F000080                    |
| confirmed_flush_lsn | 8/5F000130                    |
| wal_status          | reserved                      |
| safe_wal_size       |                               |
| two_phase           | f                             |
| inactive_since      | 2025-06-10 16:33:49.573016+02 |
| conflicting         | f                             |
| invalidation_reason |                               |
| failover            | t                             |
| synced              | f                             |
+---------------------+-------------------------------+

In the source code (slotsync.c), the check on the synced flag triggers the
error:

/* Search for the named slot */
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
    bool        synced;

    SpinLockAcquire(&slot->mutex);
    synced = slot->data.synced;
    SpinLockRelease(&slot->mutex);

    /* User-created slot with the same name exists, raise ERROR. */
    if (!synced)
        ereport(ERROR,
                errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                errmsg("exiting from slot synchronization because same"
                       " name slot \"%s\" already exists on the standby",
                       remote_slot->name));
}

5) Dropping the slot

If the slot on the standby is deleted, it is then recreated with synced =
true, and at that point, it successfully resynchronizes with the primary
slot. Everything works correctly.
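Sketched as commands (run on the standby; the exact invocation is
illustrative):

```sql
-- On the standby: drop the leftover user-created slot so that the slot
-- sync worker can recreate it with synced = true on its next cycle.
SELECT pg_drop_replication_slot('sub_test');

-- Afterwards, verify that the recreated slot is marked as synced:
SELECT slot_name, failover, synced
FROM pg_replication_slots
WHERE slot_type = 'logical';
```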

Question:
Why does the synced flag fail to change to true, even though
sync_replication_slots is enabled (on)?

Thanks for helping

Fabrice

#2Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Fabrice Chapuis (#1)
RE: failover logical replication slots

On Tue, Jun 10, 2025 at 11:46 PM Fabrice Chapuis wrote:

I'm working with logical replication in a PostgreSQL 17 setup, and I'm
exploring the new option to make replication slots failover safe in a highly
available environment using physical standby nodes managed by Patroni.

After a switchover, I encounter an error message in the PostgreSQL logs and observe unexpected behavior.
Here are the different steps I followed:

1) Setting up a new subscription

Logical replication is established between two databases on the same PostgreSQL instance.

A logical replication slot is created on the source database:

SELECT pg_create_logical_replication_slot('sub_test', 'pgoutput', false, false, true);

A subscription is then configured on the target database:

CREATE SUBSCRIPTION sub_test CONNECTION 'dbname=test host=localhost port=5432 user=user_test'
PUBLICATION pub_test WITH (create_slot=false, copy_data=false, failover=true);

The logical replication slot is active and in failover mode.

2) Starting the physical standby

A logical replication slot is successfully created on the standby

3) Cluster switchover

The switchover is initiated using the Patroni command:

patronictl switchover

The operation completes successfully, and roles are reversed in the cluster.
...
4) Issue encountered
After the switchover, an error appears in the PostgreSQL logs:

2025-06-10 16:40:58.996 CEST [739829]: [1-1] user=,db=,client=,application= LOG: slot sync worker started
2025-06-10 16:40:59.011 CEST [739829]: [2-1] user=,db=,client=,application= ERROR: exiting from slot synchronization because same name slot "sub_test" already exists on the standby
...
5) Dropping the slot

If the slot on the standby is deleted, it is then recreated with synced = true, and at that point, it successfully resynchronizes with the primary slot. Everything works correctly.

Question:
Why does the synced flag fail to change to true, even though sync_replication_slots is enabled (on)?

Thank you for reporting this. This behavior is expected because overwriting
existing slots on standbys is not permitted for now. Doing so poses a risk of
rendering slots created by users for other purposes unusable.

However, if needed, we could permit overwriting when the existing slot has
failover=true, given that enabling failover for slots on standbys is currently
disallowed, but this assumption might change in the future if we support
enabling failover to allow slot syncing to cascading standbys. Alternatively,
we could introduce options, such as a GUC, to control whether to overwrite
existing slots, though I'm not sure it's worth it.

From a database user's perspective, it's necessary to clean up any leftover
slots on a new standby following a switchover, regardless of whether the
failover slot feature is supported, because those leftover slots could lead
to excessive WAL accumulation.
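As an illustration, the leftover slots and the WAL they retain could be
inspected with a query along these lines (a sketch; pg_last_wal_replay_lsn()
is used because the node is in recovery):

```sql
-- On the new standby: list logical slots left over from its time as
-- primary, with an estimate of how much WAL each one retains.
SELECT slot_name, synced, failover, wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_replay_lsn(),
                                      restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE slot_type = 'logical';
```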

Best Regards,
Hou zj

#3Dilip Kumar
dilipbalaut@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#2)
Re: failover logical replication slots

On Wed, Jun 11, 2025 at 10:18 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:


Thank you for reporting this. This behavior is expected because overwriting
existing slots on standbys is not permitted for now. Doing so poses a risk of
rendering slots created by users for other purposes unusable.

However, if needed, we could permit overwriting when the existing slot has
failover=true, given that enabling failover for slots on standbys is currently
disallowed, but this assumption might change in the future if we support
enabling failover to allow slot syncing to cascading standbys. Alternatively,
we could introduce options, such as a GUC, to control whether to overwrite
existing slots though not sure if it's worth it.

From a database user's perspective, it's necessary to clean up any leftover
slots on a new standby following a switchover, regardless of whether the
failover slot feature is supported. Because those leftover slots could lead to
excessive WAL accumulation.

It's logical for users to clean up existing replication slots on a new
standby. Therefore, the current behavior might not be overly
inconvenient. However, providing a GUC to force slot overwrites could
streamline switchovers, allowing users to do nothing post-switchover.
I'm curious to hear others' thoughts on this.

--
Regards,
Dilip Kumar
Google

#4Fabrice Chapuis
fabrice636861@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#2)
Re: failover logical replication slots

Thanks for your reply.
The problem I see is that after creating a new subscription, we have:

1) If a failover occurs, on the new primary node the failover and synced
flags are both set to true, so there's no problem.

2) When the old node returns as a secondary in the cluster, the failover
flag is set to true and the synced flag is set to false, and the error
message is generated: ERROR: exiting from slot synchronization because
same name slot "sub_test" already exists on the standby

Why not change the value of the synced flag when the standby joins the
cluster, if the slot on the primary node has the same name as the slot on
the secondary node and the failover flag is set to true?

if ((slot = SearchNamedReplicationSlot(remote_slot->name, true))) {
    slot->data.synced = true;
    ...

Thanks for your feedback.


#5Amit Kapila
amit.kapila16@gmail.com
In reply to: Fabrice Chapuis (#4)
Re: failover logical replication slots

On Wed, Jun 11, 2025 at 10:17 PM Fabrice Chapuis
<fabrice636861@gmail.com> wrote:

Thanks for your reply.
The problem I see is that after creating a new subscription, we have:

1) if a failover occurs, on the new primary node, the failover and sync flags are both set to true, so there's no problem.

2) when the old node returns as a secondary in the cluster, the failover flag is set to true and the sync flag is set to false then
the error message is generated: ERROR: exiting from slot synchronization because same name slot "sub_test" already exists on the standby

Why not change the value of the synced flag when the standby is joining the cluster ? If the slot on the primary node has the same name as the slot on the secondary node and the failover flag is set to true,

if ((slot = SearchNamedReplicationSlot(remote_slot->name, true))) {
slot->data.synced = true
...

IIUC, Hou-san also mentioned the same idea, but it is not that
straightforward, because the user may have created a logical slot with the
same name but with a few other different properties, like two_phase,
slot_type, etc. We could try to compare all such slot properties to ensure
that it is safe to overwrite the same-name slot, but there would still be a
chance of overwriting a slot that the user has created for some other
purpose. Now, we may want to extend this functionality by giving some knob
to the user that allows overwriting existing slots with the same name. The
user could then use this knob (a GUC or something else) when starting the
node as a standby after switchover, to allow overwriting the existing
slots.

As mentioned by Hou-san and Dilip, I also think it is more important for
the old node that comes back as a standby to remove logical slots to avoid
WAL accumulation. For example, we could provide a function like
pg_drop_all_slots() with a type parameter indicating logical or physical;
utilities like Patroni that provide switchover functionality could then use
that function to remove all existing slots (perhaps keeping the slots that
are required for failover) when starting the node as a standby.
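Until such a function exists, something similar can be sketched with the
existing catalog functions (pg_drop_all_slots() itself does not exist yet,
and the NOT synced predicate, which keeps sync-worker-managed slots, is
illustrative):

```sql
-- Sketch approximating the proposed pg_drop_all_slots('logical'):
-- drop all logical slots except those managed by the slot sync worker.
-- Note: pg_drop_replication_slot() errors on active slots, so those
-- would need to be deactivated first.
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE slot_type = 'logical'
  AND NOT synced;
```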

--
With Regards,
Amit Kapila.

#6Fabrice Chapuis
fabrice636861@gmail.com
In reply to: Amit Kapila (#5)
Re: failover logical replication slots

Thanks for the reply, Amit.

I don't really understand the logic of the implementation. If the slot name
matches that of the primary slot and this slot is in failover mode, how
could it be any different on the standby slot?
After the first failover, the following failovers will work, given that the
synced flag is true on both the primary and standby slots.

After a new standby is attached to the primary, could we imagine that, when
the sync worker process starts, we check whether a failover slot exists on
the standby and, if so, drop it before recreating a new one for syncing?

Regards,

Fabrice


#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Fabrice Chapuis (#6)
Re: failover logical replication slots

On Thu, Jun 12, 2025 at 2:32 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:

Thanks for the reply Amit,

I don't really understand the logic of the implementation. If the slot name matches that of the primary slot and this slot is in failover mode, how could it be any different on the standby slot?

On standbys, we do allow creating logical slots (for example, one can use
pg_create_logical_replication_slot()). So a slot with the same name can be
created on the standby by the user before we start syncing. As of now, we
don't allow setting the failover option for slots on standbys, but in the
future it could be supported to allow syncing slots from standbys
(something like cascaded replication).
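For example, a conflicting user-created slot could come from something as
simple as the following (illustrative only):

```sql
-- On the standby: a user-created logical slot that happens to share its
-- name with the primary's failover slot. This is the kind of slot that
-- later makes the sync worker raise the "already exists" error.
SELECT pg_create_logical_replication_slot('sub_test', 'pgoutput');
```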

After the first failover, the following failovers will work given that the sync flag is true on both the primary and standby slots.

After new sandby is attached to the primary, can we imagine that when the sync worker process is started we check if a failover slot exists on the standby, if so we drop it before recreating a new one for syncing?

This has the risk of dropping an unwarranted slot.

--
With Regards,
Amit Kapila.

#8Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#7)
Re: failover logical replication slots

On Thu, Jun 12, 2025 at 3:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 12, 2025 at 2:32 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:

After the first failover, the following failovers will work given that the sync flag is true on both the primary and standby slots.

After new sandby is attached to the primary, can we imagine that when the sync worker process is started we check if a failover slot exists on the standby, if so we drop it before recreating a new one for syncing?

This has the risk of dropping an unwarranted slot.

On further thought, even if we decide to support this functionality of
overwriting the existing slots in some way, what is the guarantee that the
new standby will enable the slot sync functionality (via
sync_replication_slots)? If the standby doesn't enable
sync_replication_slots, then such slots will remain dangling and lead to
the accumulation of WAL. So, I think the first thing to do is to avoid such
cases, both for failover and non-failover slots. Then we can consider ways
to allow overwriting existing slots on the standby in the scenario you
explained.

--
With Regards,
Amit Kapila.

#9Fabrice Chapuis
fabrice636861@gmail.com
In reply to: Amit Kapila (#7)
Re: failover logical replication slots

However, the problem still persists: it is currently not possible to
perform an automatic switchover after creating a new subscription.

Would it be reasonable to consider adding a GUC to address this issue?
I can propose a patch in that sense if it seems appropriate.

What is your opinion?

Regards,

Fabrice


#10Amit Kapila
amit.kapila16@gmail.com
In reply to: Fabrice Chapuis (#9)
Re: failover logical replication slots

On Thu, Jun 12, 2025 at 3:53 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:

However, the problem still persists: it is currently not possible to perform an automatic switchover after creating a new subscription.

Would it be reasonable to consider adding a GUC to address this issue?
I can propose a patch in that sense if it seems appropriate.

Yeah, we can consider that, though I don't know at this stage whether a GUC
is the only way, but I hope you understand that it will be for PG19.

--
With Regards,
Amit Kapila.

#11Fabrice Chapuis
fabrice636861@gmail.com
In reply to: Amit Kapila (#8)
Re: failover logical replication slots

The parameter sync_replication_slots could be tested to check that it is
set to on before doing any action on failover slots.

Regards,
Fabrice


#12Fabrice Chapuis
fabrice636861@gmail.com
In reply to: Amit Kapila (#10)
Re: failover logical replication slots

Yes, of course, maybe for PG 19.

Regards,
Fabrice


#13Fabrice Chapuis
fabrice636861@gmail.com
In reply to: Fabrice Chapuis (#12)
Re: failover logical replication slots

Hi Amit,
Here is a proposed solution to handle the problem of creating the logical
replication slot on the standby after a switchover.
Thank you for your comments and help on this issue.

Regards

Fabrice

diff --git a/src/backend/replication/logical/slotsync.c
b/src/backend/replication/logical/slotsync.c
index 656e66e..296840a 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -627,6 +627,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 	ReplicationSlot *slot;
 	XLogRecPtr	latestFlushPtr;
 	bool		slot_updated = false;
+	bool		overwriting_failover_slot = true;	/* could be a GUC */
 
 	/*
 	 * Make sure that concerned WAL is received and flushed before syncing
@@ -654,19 +655,37 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 	if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
 	{
 		bool		synced;
+		bool		failover_status = remote_slot->failover;
 
 		SpinLockAcquire(&slot->mutex);
 		synced = slot->data.synced;
 		SpinLockRelease(&slot->mutex);
 
-		/* User-created slot with the same name exists, raise ERROR. */
-		if (!synced)
-			ereport(ERROR,
+		if (!synced)
+		{
+			/*
+			 * Check whether we need to overwrite an existing failover slot:
+			 * the remote slot must have the failover flag set and
+			 * sync_replication_slots must be on; other checks could be
+			 * added here.
+			 */
+			if (overwriting_failover_slot && failover_status && sync_replication_slots)
+			{
+				/* Get rid of a replication slot that is no longer wanted */
+				ReplicationSlotDrop(remote_slot->name, true);
+				ereport(WARNING,
+						errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						errmsg("slot \"%s\" already exists on the standby"
+							   " but will be dropped because overwriting_failover_slot is set to true",
+							   remote_slot->name));
+				return false;	/* back to the main loop after dropping the failover slot */
+			}
+			/* User-created slot with the same name exists, raise ERROR. */
+			else
+				ereport(ERROR,
 						errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 						errmsg("exiting from slot synchronization because same"
 							   " name slot \"%s\" already exists on the standby",
 							   remote_slot->name));
+		}
 
 		/*
 		 * The slot has been synchronized before.
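For readers skimming the diff, the branch it adds can be modeled as a small decision function. This is an illustrative sketch only (the names and return values are mine, not the C symbols): it shows which of the three outcomes the sync worker picks when a local slot clashes with a remote failover slot.

```python
def resolve_slot_conflict(local_synced: bool,
                          remote_failover: bool,
                          overwrite_failover_slot: bool) -> str:
    """Model of the name-clash handling in synchronize_one_slot().

    Returns one of:
      "update"         - slot was created by a previous sync cycle,
                         safe to update in place;
      "drop_and_retry" - patched behaviour: drop the clashing slot,
                         emit a WARNING, and retry on the next cycle;
      "error"          - stock behaviour: abort slot synchronization.
    """
    if local_synced:
        # The slot has been synchronized before; update it in place.
        return "update"
    if overwrite_failover_slot and remote_failover:
        # Patched behaviour: drop the user-created slot and retry.
        return "drop_and_retry"
    # Stock behaviour: a user-created slot with the same name exists.
    return "error"
```

The key design point the thread debates is the middle branch: whether dropping a user-created slot should be gated by a GUC, a slot property, or a new API.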
On Thu, Jun 12, 2025 at 4:27 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:


yes of course, maybe for PG 19

Regards,
Fabrice

On Thu, Jun 12, 2025 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jun 12, 2025 at 3:53 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:

However, the problem still persists: it is currently not possible to perform an automatic switchover after creating a new subscription.

Would it be reasonable to consider adding a GUC to address this issue?
I can propose a patch in that sense if it seems appropriate.

Yeah, we can consider that, though I don't know at this stage whether a GUC is the only way, but I hope you understand that it will be for PG19.

--
With Regards,
Amit Kapila.

#14Amit Kapila
amit.kapila16@gmail.com
In reply to: Fabrice Chapuis (#13)
Re: failover logical replication slots

On Fri, Jul 11, 2025 at 8:42 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:

Hi Amit,
Here is a proposed solution to handle the problem of creating the logical replication slot on the standby after a switchover.
Thank you for your comments and help on this issue.

Regards

Fabrice

diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 656e66e..296840a 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -627,6 +627,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlot *slot;
XLogRecPtr      latestFlushPtr;
bool            slot_updated = false;
+       bool            overwriting_failover_slot = true; /* could be a GUC */
/*
* Make sure that concerned WAL is received and flushed before syncing
@@ -654,19 +655,37 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
{
bool            synced;
+               bool            failover_status = remote_slot->failover;

SpinLockAcquire(&slot->mutex);
synced = slot->data.synced;
SpinLockRelease(&slot->mutex);

-               /* User-created slot with the same name exists, raise ERROR. */
-               if (!synced)
-                       ereport(ERROR,
+               if (!synced){
+                       /*
+                        * Check if we need to overwrite an existing failover slot and
+                        * if slot has the failover flag set to true
+                        * and the sync_replication_slots is on,
+                        * other check could be added here */
+                       if (overwriting_failover_slot && failover_status && sync_replication_slots){
+

I think we don't need to explicitly check sync_replication_slots, as we should reach here only when that flag is set. I think we should introduce a pg_alter_replication_slot() function that allows overwriting existing slots during sync by setting a parameter like allow_overwrite (or something like that). This API would be useful for other purposes as well, such as changing the two_phase or failover properties of a slot after its creation. BTW, we also discussed supporting a pg_drop_all_slots kind of API. See if you are interested in implementing that API as well.
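As a rough sketch of how such an API might look from SQL (the function name and parameters here are hypothetical; no such function exists yet):

```sql
-- Hypothetical API, illustrative only: let the sync worker overwrite a
-- clashing slot, or change slot properties after creation.
SELECT pg_alter_replication_slot('sub_test', allow_overwrite => true);
SELECT pg_alter_replication_slot('sub_test', failover => true);
```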

Note: I suggest starting a new thread with the concrete proposal for
the new API or GUC, stating how it will be helpful. It might help in
getting suggestions from others as well.

--
With Regards,
Amit Kapila.

#15Fabrice Chapuis
fabrice636861@gmail.com
In reply to: Amit Kapila (#14)
Re: failover logical replication slots

Thanks for this feedback.
I'll remove the check on the sync_replication_slots parameter. As you suggest, starting with the pg_alter_replication_slot API is an interesting idea; I will open a new thread with a concrete proposal and hope the approach finds support. Is there already an initiative for a pg_drop_all_slots API?

Best Regards,

Fabrice

On Sun, Jul 13, 2025 at 1:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:


#16Amit Kapila
amit.kapila16@gmail.com
In reply to: Fabrice Chapuis (#15)
Re: failover logical replication slots

On Mon, Jul 14, 2025 at 9:10 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:

Thanks for this feedback,
I'll remove the check on the sync_replication_slots parameter.
I think it is interesting as you suggest to start with the idea of the pg_alter_replication_slot API, I will make a new proposal by opening a new thread, hoping to be supported in my approach. Is there already an initiative about the pg_drop_all_slots API?

No, to my knowledge, there is no other initiative for the pg_drop_all_slots() API. I think the first use case we had for it was letting users drop all slots on the new standby after a failover, to avoid excessive resource usage.
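Until a dedicated pg_drop_all_slots() exists, that cleanup can be approximated today with a query over the catalog. This is a sketch: run it on the new standby and tighten the WHERE clause as needed, since as written it drops every inactive slot, physical and logical alike.

```sql
-- Drop all replication slots that are not currently in use.
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE active = false;
```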

--
With Regards,
Amit Kapila.