Clear logical slot&#39;s &#39;synced&#39; flag on promotion of standby

itsajin@gmail.com

4 months ago

In reply to: shveta malik (#1)

1 attachment(s)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Tue, Sep 9, 2025 at 4:21 PM shveta malik <shveta.malik@gmail.com> wrote:

Hi,

This is a spin-off thread from [1].

Currently, in the slot-sync worker, we have an error scenario [2]
where, during slot synchronization, if we detect a slot with the same
name and its synced flag is set to false, we emit an error. The
rationale is to avoid potentially overwriting a user-created slot.

But while analyzing [1], we observed that this error can lead to
inconsistent behavior during switchovers. On the first switchover, the
new standby logs an error: "Exiting from slot synchronization because
a slot with the same name already exists on the standby." But during
a double switchover, this error does not occur.

Upon re-evaluating this, it seems more appropriate to clear the synced
flag after promotion, as the flag does not hold any meaning on the
primary. Doing so would ensure consistent behavior across all
switchovers, as the same error will be raised avoiding the risk of
overwriting user's slots.

A patch can be posted soon on the same idea.

Hi Shveta,

Here’s a patch that addresses this issue. It clears any “synced” flags
on logical replication slots when a standby is promoted. I’ve also
added handling for crashes; if the server crashes before the flags are
cleared, they are reset on restart.
The restart logic was a bit tricky, since I had to rely on the
database state to decide when the reset is needed. Documentation on
these states is sparse, but from my testing I found that
DB_IN_CRASH_RECOVERY occurs when a standby crashes during promotion.
That’s the state I use to trigger the flag reset on restart.

regards,
Ajin Cherian
Fujitsu Australia

Attachments:

v1-0001-Reset-synced-slots-when-a-standby-is-promoted.patchapplication/octet-stream; name=v1-0001-Reset-synced-slots-when-a-standby-is-promoted.patchDownload

From 665f7b623f46247659cf42c8239a7109ae2db819 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <itsajin@gmail.com>
Date: Tue, 9 Sep 2025 17:10:22 +1000
Subject: [PATCH v1] Reset synced slots when a standby is promoted.

On promotion, reset any slots which have the 'synced' flag set so that
the primary starts from a clean state. This ensures consistent
behavior across all switchovers.
---
 src/backend/access/transam/xlog.c             | 20 +++++--
 src/backend/replication/slot.c                | 52 +++++++++++++++++++
 src/include/replication/slot.h                |  1 +
 .../t/040_standby_failover_slots_sync.pl      |  6 +--
 4 files changed, 71 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7ffb2179151..958e2a4271f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5622,9 +5622,14 @@ StartupXLOG(void)
 
 	/*
 	 * Initialize replication slots, before there's a chance to remove
-	 * required resources.
+	 * required resources. Clear any leftover 'synced' flags on replication
+	 * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
+	 * state check ensures that this code is only reached when a standby
+	 * server crashes during promotion.
 	 */
 	StartupReplicationSlots();
+	if (ControlFile->state == DB_IN_CRASH_RECOVERY)
+		ResetSyncedSlots();
 
 	/*
 	 * Startup logical state, needs to be setup now so we have proper data
@@ -6224,13 +6229,18 @@ StartupXLOG(void)
 	WalSndWakeup(true, true);
 
 	/*
-	 * If this was a promotion, request an (online) checkpoint now. This isn't
-	 * required for consistency, but the last restartpoint might be far back,
-	 * and in case of a crash, recovering from it might take a longer than is
-	 * appropriate now that we're not in standby mode anymore.
+	 * If this was a promotion, first reset any slots that had been marked as
+	 * synced during standby mode. Then request an (online) checkpoint.
+	 * The checkpoint isn't required for consistency, but the last
+	 * restartpoint might be far back, and in case of a crash, recovery
+	 * could take longer than desirable now that we're not in standby
+	 * mode anymore.
 	 */
 	if (promoted)
+	{
+		ResetSyncedSlots();
 		RequestCheckpoint(CHECKPOINT_FORCE);
+	}
 }
 
 /*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb96d42..01a6e0de133 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -852,6 +852,57 @@ restart:
 	LWLockRelease(ReplicationSlotControlLock);
 }
 
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset all replication slots that have synced=true to synced=false.
+ */
+void
+ResetSyncedSlots(void)
+{
+	int			i;
+
+	/*
+	 * Iterate through all replication slot entries and reset synced ones
+	 */
+	for (i = 0; i < max_replication_slots; i++)
+	{
+		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+		/* Skip inactive/unused slots */
+		if (!s->in_use)
+			continue;
+
+		/* we're only interested in logical slots */
+		if (!SlotIsLogical(s))
+			continue;
+
+		/* Check if this slot was marked as synced */
+		if (s->data.synced)
+		{
+			/* Acquire the slot */
+			ReplicationSlotAcquire(NameStr(s->data.name), false, true);
+
+			/* Reset the synced flag under spinlock protection */
+			SpinLockAcquire(&s->mutex);
+			s->data.synced = false;
+			SpinLockRelease(&s->mutex);
+
+			/* Mark dirty and save outside the spinlock */
+			ReplicationSlotMarkDirty();
+			ReplicationSlotSave();
+
+			ereport(LOG,
+				(errmsg("reset synced flag for replication slot \"%s\"",
+					NameStr(s->data.name))));
+
+			/* Release the slot */
+			ReplicationSlotRelease();
+		}
+	}
+
+}
+
 /*
  * Permanently drop replication slot identified by the passed in name.
  */
@@ -2212,6 +2263,7 @@ StartupReplicationSlots(void)
 		/* we crashed while a slot was being setup or deleted, clean up */
 		if (pg_str_endswith(replication_de->d_name, ".tmp"))
 		{
+			elog(LOG, "there was a leftover tmp file for slots");
 			if (!rmtree(path, true))
 			{
 				ereport(WARNING,
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162cde3..7902d51781d 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -336,6 +336,7 @@ extern int	ReplicationSlotIndex(ReplicationSlot *slot);
 extern bool ReplicationSlotName(int index, Name name);
 extern void ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char *syncslotname, Size szslot);
 extern void ReplicationSlotDropAtPubNode(WalReceiverConn *wrconn, char *slotname, bool missing_ok);
+extern void ResetSyncedSlots(void);
 
 extern void StartupReplicationSlots(void);
 extern void CheckPointReplicationSlots(bool is_shutdown);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 2c61c51e914..0f225aa09c1 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -932,13 +932,13 @@ my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
 $subscriber1->safe_psql('postgres',
 	"ALTER SUBSCRIPTION regress_mysub1 CONNECTION '$standby1_conninfo';");
 
-# Confirm the synced slot 'lsub1_slot' is retained on the new primary
+# Confirm the synced slot 'lsub1_slot' is reset on the new primary
 is( $standby1->safe_psql(
 		'postgres',
 		q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN ('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
 	),
-	't',
-	'synced slot retained on the new primary');
+	'f',
+	'synced slot reset on the new primary');
 
 # Commit the prepared transaction
 $standby1->safe_psql('postgres', "COMMIT PREPARED 'test_twophase_slotsync';");
-- 
2.47.3

ashu.coek88@gmail.com

4 months ago

In reply to: Ajin Cherian (#2)

Re: Clear logical slot's 'synced' flag on promotion of standby

Hi,

On Tue, Sep 9, 2025 at 12:53 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Tue, Sep 9, 2025 at 4:21 PM shveta malik <shveta.malik@gmail.com> wrote:

Hi,

This is a spin-off thread from [1].

Currently, in the slot-sync worker, we have an error scenario [2]
where, during slot synchronization, if we detect a slot with the same
name and its synced flag is set to false, we emit an error. The
rationale is to avoid potentially overwriting a user-created slot.

But while analyzing [1], we observed that this error can lead to
inconsistent behavior during switchovers. On the first switchover, the
new standby logs an error: "Exiting from slot synchronization because
a slot with the same name already exists on the standby." But during
a double switchover, this error does not occur.

Upon re-evaluating this, it seems more appropriate to clear the synced
flag after promotion, as the flag does not hold any meaning on the
primary. Doing so would ensure consistent behavior across all
switchovers, as the same error will be raised avoiding the risk of
overwriting user's slots.

A patch can be posted soon on the same idea.

Hi Shveta,

Here’s a patch that addresses this issue. It clears any “synced” flags
on logical replication slots when a standby is promoted. I’ve also
added handling for crashes; if the server crashes before the flags are
cleared, they are reset on restart.
The restart logic was a bit tricky, since I had to rely on the
database state to decide when the reset is needed. Documentation on
these states is sparse, but from my testing I found that
DB_IN_CRASH_RECOVERY occurs when a standby crashes during promotion.
That’s the state I use to trigger the flag reset on restart.

+ * required resources. Clear any leftover 'synced' flags on replication
+ * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
+ * state check ensures that this code is only reached when a standby
+ * server crashes during promotion.
  */
  StartupReplicationSlots();
+ if (ControlFile->state == DB_IN_CRASH_RECOVERY)

I believe the primary server can also enter the DB_IN_CRASH_RECOVERY
state. For example, if the primary is already in crash recovery and
crashes again while in crash recovery, it will restart in the
DB_IN_CRASH_RECOVERY state, no?

With this change are we saying that on primary the synced flag must be
always false. Because the postgres doc on pg_replication_slots says:

"The value of this column has no meaning on the primary server; the
column value on the primary is default false for all slots but may (if
leftover from a promoted standby) also be true."

--
With Regards,
Ashutosh Sharma.

Masahiko Sawada

sawada.mshk@gmail.com

4 months ago

In reply to: shveta malik (#1)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Mon, Sep 8, 2025 at 11:21 PM shveta malik <shveta.malik@gmail.com> wrote:

Hi,

This is a spin-off thread from [1].

Currently, in the slot-sync worker, we have an error scenario [2]
where, during slot synchronization, if we detect a slot with the same
name and its synced flag is set to false, we emit an error. The
rationale is to avoid potentially overwriting a user-created slot.

But while analyzing [1], we observed that this error can lead to
inconsistent behavior during switchovers. On the first switchover, the
new standby logs an error: "Exiting from slot synchronization because
a slot with the same name already exists on the standby." But during
a double switchover, this error does not occur.

Upon re-evaluating this, it seems more appropriate to clear the synced
flag after promotion, as the flag does not hold any meaning on the
primary. Doing so would ensure consistent behavior across all
switchovers, as the same error will be raised avoiding the risk of
overwriting user's slots.

There is the following comment in FinishWalRecovery():

/*
* Shutdown the slot sync worker to drop any temporary slots acquired by
* it and to prevent it from keep trying to fetch the failover slots.
*
* We do not update the 'synced' column in 'pg_replication_slots' system
* view from true to false here, as any failed update could leave 'synced'
* column false for some slots. This could cause issues during slot sync
* after restarting the server as a standby. While updating the 'synced'
* column after switching to the new timeline is an option, it does not
* simplify the handling for the 'synced' column. Therefore, we retain the
* 'synced' column as true after promotion as it may provide useful
* information about the slot origin.
*/
ShutDownSlotSync();

Does the patch address the above concerns?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

shveta.malik@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#3)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Tue, Sep 9, 2025 at 2:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi,
+ * required resources. Clear any leftover 'synced' flags on replication
+ * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
+ * state check ensures that this code is only reached when a standby
+ * server crashes during promotion.
*/
StartupReplicationSlots();
+ if (ControlFile->state == DB_IN_CRASH_RECOVERY)
I believe the primary server can also enter the DB_IN_CRASH_RECOVERY
state. For example, if the primary is already in crash recovery and
crashes again while in crash recovery, it will restart in the
DB_IN_CRASH_RECOVERY state, no?

Yes, good point. I think we can differentiate the two cases based on
the timeline change. A regular primary won't have a timeline change,
whereas a promoted standby that failed during promotion will show a
timeline change immediately upon restart. Thoughts?

In the worst-case scenario, even if we end up running the Reset
function during a regular primary's crash recovery, it shouldn't cause
any harm. (That said, I'm not suggesting we shouldn't fix it). What
concerns me more is the possibility of running it on a regular
standby, as it could disrupt slot synchronization. I attempted to
simulate a scenario where a regular standby ends up in
DB_IN_CRASH_RECOVERY after a crash, but I couldn't reproduce it. Do
you know of any situation where this could happen? The absence of
comments for these states makes it challenging to follow the flow.

--

With this change are we saying that on primary the synced flag must be
always false. Because the postgres doc on pg_replication_slots says:

"The value of this column has no meaning on the primary server; the
column value on the primary is default false for all slots but may (if
leftover from a promoted standby) also be true."

The doc needs change.

thanks
Shveta

[1]: /messages/by-id/CAE9k0P=WXRHXLGxkegFLj9tVLrY45+uTtdgv+Pjt1mqyit4zZw@mail.gmail.com

shveta.malik@gmail.com

4 months ago

In reply to: Masahiko Sawada (#4)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Wed, Sep 10, 2025 at 5:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Sep 8, 2025 at 11:21 PM shveta malik <shveta.malik@gmail.com> wrote:

Hi,

This is a spin-off thread from [1].

Currently, in the slot-sync worker, we have an error scenario [2]
where, during slot synchronization, if we detect a slot with the same
name and its synced flag is set to false, we emit an error. The
rationale is to avoid potentially overwriting a user-created slot.

But while analyzing [1], we observed that this error can lead to
inconsistent behavior during switchovers. On the first switchover, the
new standby logs an error: "Exiting from slot synchronization because
a slot with the same name already exists on the standby." But during
a double switchover, this error does not occur.

Upon re-evaluating this, it seems more appropriate to clear the synced
flag after promotion, as the flag does not hold any meaning on the
primary. Doing so would ensure consistent behavior across all
switchovers, as the same error will be raised avoiding the risk of
overwriting user's slots.

There is the following comment in FinishWalRecovery():

/*
* Shutdown the slot sync worker to drop any temporary slots acquired by
* it and to prevent it from keep trying to fetch the failover slots.
*
* We do not update the 'synced' column in 'pg_replication_slots' system
* view from true to false here, as any failed update could leave 'synced'
* column false for some slots. This could cause issues during slot sync
* after restarting the server as a standby. While updating the 'synced'
* column after switching to the new timeline is an option, it does not
* simplify the handling for the 'synced' column. Therefore, we retain the
* 'synced' column as true after promotion as it may provide useful
* information about the slot origin.
*/
ShutDownSlotSync();

Does the patch address the above concerns?

Yes, the patch is attempting to address the above concern. it is
trying to Reset synced-column after switching to a new timeline. There
is an issue though as pointed out by Ashutosh in [1]/messages/by-id/CAE9k0P=WXRHXLGxkegFLj9tVLrY45+uTtdgv+Pjt1mqyit4zZw@mail.gmail.com, which needs to
be addressed.

thanks
Shveta

ashu.coek88@gmail.com

4 months ago

In reply to: shveta malik (#5)

Re: Clear logical slot's 'synced' flag on promotion of standby

Hi,

On Thu, Sep 11, 2025 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Sep 9, 2025 at 2:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Hi,
+ * required resources. Clear any leftover 'synced' flags on replication
+ * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
+ * state check ensures that this code is only reached when a standby
+ * server crashes during promotion.
*/
StartupReplicationSlots();
+ if (ControlFile->state == DB_IN_CRASH_RECOVERY)
I believe the primary server can also enter the DB_IN_CRASH_RECOVERY
state. For example, if the primary is already in crash recovery and
crashes again while in crash recovery, it will restart in the
DB_IN_CRASH_RECOVERY state, no?
Yes, good point. I think we can differentiate the two cases based on
the timeline change. A regular primary won't have a timeline change,
whereas a promoted standby that failed during promotion will show a
timeline change immediately upon restart. Thoughts?

We already read the recovery signal files (standby.signal or
recovery.signal) at the start of StartupXLOG() via InitWalRecovery(),
which sets the StandbyModeRequested flag. Couldn’t we use this to
distinguish whether the server is a primary undergoing crash recovery
or a standby?

I attempted to

simulate a scenario where a regular standby ends up in
DB_IN_CRASH_RECOVERY after a crash, but I couldn't reproduce it. Do
you know of any situation where this could happen? The absence of
comments for these states makes it challenging to follow the flow.

The log message for "case DB_IN_CRASH_RECOVERY:" inside StartupXLOG
should indicate that the server has entered crash recovery, no? And..
If you still want the server to crash while in this state, you could
add your own PANIC or FATAL error message inside the startupxlog.

--
With Regards,
Ashutosh Sharma.

ashu.coek88@gmail.com

4 months ago

In reply to: shveta malik (#5)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Thu, Sep 11, 2025 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Sep 9, 2025 at 2:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Hi,
+ * required resources. Clear any leftover 'synced' flags on replication
+ * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
+ * state check ensures that this code is only reached when a standby
+ * server crashes during promotion.
*/
StartupReplicationSlots();
+ if (ControlFile->state == DB_IN_CRASH_RECOVERY)
I believe the primary server can also enter the DB_IN_CRASH_RECOVERY
state. For example, if the primary is already in crash recovery and
crashes again while in crash recovery, it will restart in the
DB_IN_CRASH_RECOVERY state, no?
Yes, good point. I think we can differentiate the two cases based on
the timeline change. A regular primary won't have a timeline change,
whereas a promoted standby that failed during promotion will show a
timeline change immediately upon restart. Thoughts?

Will there be any issues if we clear the sync status immediately after
the standby.signal file is removed from the standby server?

We could maybe introduce a temporary "promote.inprogress" marker file
on disk before removing standby.signal. The sequence would be:

1) Create promote.inprogress.
2) Unlink standby.signal
3) Clear the sync slot status.
4) Remove promote.inprogress.

This way, if the server crashes after standby.signal is removed but
before the sync status is cleared, the presence of promote.inprogress
would indicate that the standby was in the middle of promotion and
crashed before slot cleanup. On restart, we could use that marker to
detect the incomplete promotion and finish clearing the sync flags.

If the crash happens at a later stage, the server will no longer start
as a standby anyway, and by then the sync flags would already have
been reset.

This is just a thought and it may sound a bit naive. Let me know if I
am overlooking something.

--
With Regards,
Ashutosh Sharma.

shveta.malik@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#7)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Thu, Sep 11, 2025 at 3:16 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi,

We already read the recovery signal files (standby.signal or
recovery.signal) at the start of StartupXLOG() via InitWalRecovery(),
which sets the StandbyModeRequested flag. Couldn’t we use this to
distinguish whether the server is a primary undergoing crash recovery
or a standby?

The objective is not to distinguish between a primary and a standby
undergoing crash recovery, but to differentiate between a primary
undergoing crash recovery and a promoted standby (now the new primary)
during the immediate next startup—specifically in cases where the
promotion failed late in the process, such as during ResetSyncedSlots.
StandbyModeRequested will be false in both the cases and thus cannot
be used.

I attempted to

simulate a scenario where a regular standby ends up in
DB_IN_CRASH_RECOVERY after a crash, but I couldn't reproduce it. Do
you know of any situation where this could happen? The absence of
comments for these states makes it challenging to follow the flow.

The log message for "case DB_IN_CRASH_RECOVERY:" inside StartupXLOG
should indicate that the server has entered crash recovery, no? And..
If you still want the server to crash while in this state, you could
add your own PANIC or FATAL error message inside the startupxlog.

I think my query was not correctly understood. The intent was to know
if we can ever hit 'DB_IN_CRASH_RECOVERY' on a regular standby. But, I
think the answer is 'No'. We can hit it on primary (before , during,
or after promotion after a crash).

thanks
Shveta

#10

shveta.malik@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#8)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Thu, Sep 11, 2025 at 7:29 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

On Thu, Sep 11, 2025 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 9, 2025 at 2:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Hi,
+ * required resources. Clear any leftover 'synced' flags on replication
+ * slots when in crash recovery on the primary. The DB_IN_CRASH_RECOVERY
+ * state check ensures that this code is only reached when a standby
+ * server crashes during promotion.
*/
StartupReplicationSlots();
+ if (ControlFile->state == DB_IN_CRASH_RECOVERY)
I believe the primary server can also enter the DB_IN_CRASH_RECOVERY
state. For example, if the primary is already in crash recovery and
crashes again while in crash recovery, it will restart in the
DB_IN_CRASH_RECOVERY state, no?
Yes, good point. I think we can differentiate the two cases based on
the timeline change. A regular primary won't have a timeline change,
whereas a promoted standby that failed during promotion will show a
timeline change immediately upon restart. Thoughts?
Will there be any issues if we clear the sync status immediately after
the standby.signal file is removed from the standby server?

We could maybe introduce a temporary "promote.inprogress" marker file
on disk before removing standby.signal. The sequence would be:

1) Create promote.inprogress.
2) Unlink standby.signal
3) Clear the sync slot status.
4) Remove promote.inprogress.

This way, if the server crashes after standby.signal is removed but
before the sync status is cleared, the presence of promote.inprogress
would indicate that the standby was in the middle of promotion and
crashed before slot cleanup. On restart, we could use that marker to
detect the incomplete promotion and finish clearing the sync flags.

If the crash happens at a later stage, the server will no longer start
as a standby anyway, and by then the sync flags would already have
been reset.

This is just a thought and it may sound a bit naive. Let me know if I
am overlooking something.

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

thanks
Shveta

#11

itsajin@gmail.com

4 months ago

In reply to: shveta malik (#10)

1 attachment(s)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta.malik@gmail.com> wrote:

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.

Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.

The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.

I have also changed the documentation and comments regarding 'synced'
flags not being reset on the primary.

regards,
Ajin Cherian
Fujitsu Australia

Attachments:

v2-0001-Reset-synced-slots-when-a-standby-is-promoted.patchapplication/octet-stream; name=v2-0001-Reset-synced-slots-when-a-standby-is-promoted.patchDownload

From 7ac4df2826f17b22accce0d7b257bc7003b1c98a Mon Sep 17 00:00:00 2001
From: Ajin Cherian <itsajin@gmail.com>
Date: Thu, 18 Sep 2025 20:33:36 +1000
Subject: [PATCH v2] Reset synced slots when a standby is promoted.

On promotion, reset any slots which have the 'synced' flag set so that
the primary starts from a clean state. This ensures consistent
behavior across all switchovers.
---
 doc/src/sgml/system-views.sgml                |  3 +-
 src/backend/access/transam/xlog.c             | 16 ++++--
 src/backend/access/transam/xlogrecovery.c     |  9 ---
 src/backend/replication/slot.c                | 55 +++++++++++++++++++
 src/include/replication/slot.h                |  1 +
 .../t/040_standby_failover_slots_sync.pl      |  6 +-
 6 files changed, 71 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 4187191ea74..ff9384127cd 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3031,8 +3031,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
        On a hot standby, the slots with the synced column marked as true can
        neither be used for logical decoding nor dropped manually. The value
        of this column has no meaning on the primary server; the column value on
-       the primary is default false for all slots but may (if leftover from a
-       promoted standby) also be true.
+       the primary is false for all slots.
       </para></entry>
      </row>
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0baf0ac6160..7b7e6989d55 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5622,7 +5622,8 @@ StartupXLOG(void)
 
 	/*
 	 * Initialize replication slots, before there's a chance to remove
-	 * required resources.
+	 * required resources. Clear any leftover 'synced' flags on replication
+	 * slots when on the primary.
 	 */
 	StartupReplicationSlots();
 
@@ -6224,13 +6225,18 @@ StartupXLOG(void)
 	WalSndWakeup(true, true);
 
 	/*
-	 * If this was a promotion, request an (online) checkpoint now. This isn't
-	 * required for consistency, but the last restartpoint might be far back,
-	 * and in case of a crash, recovering from it might take a longer than is
-	 * appropriate now that we're not in standby mode anymore.
+	 * If this was a promotion, first reset any slots that had been marked as
+	 * synced during standby mode. Then request an (online) checkpoint.
+	 * The checkpoint isn't required for consistency, but the last
+	 * restartpoint might be far back, and in case of a crash, recovery
+	 * could take longer than desirable now that we're not in standby
+	 * mode anymore.
 	 */
 	if (promoted)
+	{
+		ResetSyncedSlots();
 		RequestCheckpoint(CHECKPOINT_FORCE);
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 346319338a0..37ad309201e 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1481,15 +1481,6 @@ FinishWalRecovery(void)
 	/*
 	 * Shutdown the slot sync worker to drop any temporary slots acquired by
 	 * it and to prevent it from keep trying to fetch the failover slots.
-	 *
-	 * We do not update the 'synced' column in 'pg_replication_slots' system
-	 * view from true to false here, as any failed update could leave 'synced'
-	 * column false for some slots. This could cause issues during slot sync
-	 * after restarting the server as a standby. While updating the 'synced'
-	 * column after switching to the new timeline is an option, it does not
-	 * simplify the handling for the 'synced' column. Therefore, we retain the
-	 * 'synced' column as true after promotion as it may provide useful
-	 * information about the slot origin.
 	 */
 	ShutDownSlotSync();
 
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb96d42..95fae5e49ad 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -852,6 +852,57 @@ restart:
 	LWLockRelease(ReplicationSlotControlLock);
 }
 
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset all replication slots that have synced=true to synced=false.
+ */
+void
+ResetSyncedSlots(void)
+{
+	int			i;
+
+	/*
+	 * Iterate through all replication slot entries and reset synced ones
+	 */
+	for (i = 0; i < max_replication_slots; i++)
+	{
+		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+		/* Skip inactive/unused slots */
+		if (!s->in_use)
+			continue;
+
+		/* we're only interested in logical slots */
+		if (!SlotIsLogical(s))
+			continue;
+
+		/* Check if this slot was marked as synced */
+		if (s->data.synced)
+		{
+			/* Acquire the slot */
+			ReplicationSlotAcquire(NameStr(s->data.name), false, true);
+
+			/* Reset the synced flag under spinlock protection */
+			SpinLockAcquire(&s->mutex);
+			s->data.synced = false;
+			SpinLockRelease(&s->mutex);
+
+			/* Mark dirty and save outside the spinlock */
+			ReplicationSlotMarkDirty();
+			ReplicationSlotSave();
+
+			ereport(LOG,
+				(errmsg("reset synced flag for replication slot \"%s\"",
+					NameStr(s->data.name))));
+
+			/* Release the slot */
+			ReplicationSlotRelease();
+		}
+	}
+
+}
+
 /*
  * Permanently drop replication slot identified by the passed in name.
  */
@@ -2664,6 +2715,10 @@ RestoreSlotFromDisk(const char *name)
 		memcpy(&slot->data, &cp.slotdata,
 			   sizeof(ReplicationSlotPersistentData));
 
+		/* reset synced flag if this is a primary server */
+		if (!StandbyMode)
+			slot->data.synced = false;
+
 		/* initialize in memory state */
 		slot->effective_xmin = cp.slotdata.xmin;
 		slot->effective_catalog_xmin = cp.slotdata.catalog_xmin;
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162cde3..7902d51781d 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -336,6 +336,7 @@ extern int	ReplicationSlotIndex(ReplicationSlot *slot);
 extern bool ReplicationSlotName(int index, Name name);
 extern void ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char *syncslotname, Size szslot);
 extern void ReplicationSlotDropAtPubNode(WalReceiverConn *wrconn, char *slotname, bool missing_ok);
+extern void ResetSyncedSlots(void);
 
 extern void StartupReplicationSlots(void);
 extern void CheckPointReplicationSlots(bool is_shutdown);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 2c61c51e914..0f225aa09c1 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -932,13 +932,13 @@ my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
 $subscriber1->safe_psql('postgres',
 	"ALTER SUBSCRIPTION regress_mysub1 CONNECTION '$standby1_conninfo';");
 
-# Confirm the synced slot 'lsub1_slot' is retained on the new primary
+# Confirm the synced slot 'lsub1_slot' is reset on the new primary
 is( $standby1->safe_psql(
 		'postgres',
 		q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN ('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
 	),
-	't',
-	'synced slot retained on the new primary');
+	'f',
+	'synced slot reset on the new primary');
 
 # Commit the prepared transaction
 $standby1->safe_psql('postgres', "COMMIT PREPARED 'test_twophase_slotsync';");
-- 
2.47.3

#12

ashu.coek88@gmail.com

4 months ago

In reply to: Ajin Cherian (#11)

Re: Clear logical slot's 'synced' flag on promotion of standby

Hi Ajin,

On Thu, Sep 18, 2025 at 4:16 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta.malik@gmail.com> wrote:

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.

Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.

The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.

+/*
+ * ResetSyncedSlots()
+ *
+ * Reset all replication slots that have synced=true to synced=false.
+ */

I feel this is not correct, we are force resetting sync flag status
for all logical slots, not just the one that is set to true.

@@ -2664,6 +2715,10 @@ RestoreSlotFromDisk(const char *name)
memcpy(&slot->data, &cp.slotdata,
sizeof(ReplicationSlotPersistentData));

+ /* reset synced flag if this is a primary server */
+ if (!StandbyMode)
+ slot->data.synced = false;
+

I think you also need to ensure that you are only doing this for the
logical slots, it's currently just checking if the slot is in-use or
not.

--
With Regards,
Ashutosh Sharma.

#13

ashu.coek88@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#12)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Thu, Sep 18, 2025 at 5:20 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi Ajin,

On Thu, Sep 18, 2025 at 4:16 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta.malik@gmail.com> wrote:

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.

Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.

The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset all replication slots that have synced=true to synced=false.
+ */
I feel this is not correct, we are force resetting sync flag status
for all logical slots, not just the one that is set to true.

You may ignore this, it's actually resetting only the synced slots.
Sorry for the noise.

--
With Regards,
Ashutosh Sharma.

#14

shveta.malik@gmail.com

4 months ago

In reply to: Ajin Cherian (#11)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Thu, Sep 18, 2025 at 4:16 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta.malik@gmail.com> wrote:

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.

Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.

+1 for the idea. I would like to know what others think here.

The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.

I have also changed the documentation and comments regarding 'synced'
flags not being reset on the primary.

Please find a few comments:

1)
+ * Reset all replication slots that have synced=true to synced=false.

Can we please change it to:
Reset the synced flag to false for all replication slots where it is
currently true.

2)
I was wondering that since we reset the sync flag everytime we load
slots from disk , then do we even need ResetSyncedSlots() during
promotion? But I guess we still need it because even after promotion
(if not restarted), existing backend sessions stay alive and it makes
sense if they too see 'synced' as false after promotion. Is it worth
adding this in comments atop ResetSyncedSlots() call during promotion?

3)
+ if (!StandbyMode)
+ slot->data.synced = false;

a)
Do we need to mark the slot as dirty so it gets saved to disk on the
next chance?

I think ReplicationSlotSave can be skipped, as it may not be
appropriate in the restore flow. But marking the slot dirty is
important to avoid resetting the sync flag again on the next startup.
A crash between marking it dirty and persisting it would still require
a reset, but that seems acceptable. Thoughts?

b)
Also if we are marking it dirty, it makes sense to set synced to false
only after checking if synced is true already.

thanks
Shveta

#15

shveta.malik@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#12)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Thu, Sep 18, 2025 at 5:20 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi Ajin,

On Thu, Sep 18, 2025 at 4:16 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta.malik@gmail.com> wrote:

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.

Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.

The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset all replication slots that have synced=true to synced=false.
+ */
I feel this is not correct, we are force resetting sync flag status
for all logical slots, not just the one that is set to true.

--

@@ -2664,6 +2715,10 @@ RestoreSlotFromDisk(const char *name)
memcpy(&slot->data, &cp.slotdata,
sizeof(ReplicationSlotPersistentData));
+ /* reset synced flag if this is a primary server */
+ if (!StandbyMode)
+ slot->data.synced = false;
+
I think you also need to ensure that you are only doing this for the
logical slots, it's currently just checking if the slot is in-use or
not.

I think a better approach would be to reset synced only if it is
marked as synced. Adding a LogicalSlot check wouldn't be incorrect,
but IMO, it may not be necessary.

thanks
Shveta

#16

ashu.coek88@gmail.com

4 months ago

In reply to: shveta malik (#15)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Fri, Sep 19, 2025 at 3:04 PM shveta malik <shveta.malik@gmail.com> wrote:

On Thu, Sep 18, 2025 at 5:20 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Hi Ajin,

On Thu, Sep 18, 2025 at 4:16 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta.malik@gmail.com> wrote:

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.

Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.

The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset all replication slots that have synced=true to synced=false.
+ */
I feel this is not correct, we are force resetting sync flag status
for all logical slots, not just the one that is set to true.

--

@@ -2664,6 +2715,10 @@ RestoreSlotFromDisk(const char *name)
memcpy(&slot->data, &cp.slotdata,
sizeof(ReplicationSlotPersistentData));
+ /* reset synced flag if this is a primary server */
+ if (!StandbyMode)
+ slot->data.synced = false;
+
I think you also need to ensure that you are only doing this for the
logical slots, it's currently just checking if the slot is in-use or
not.
I think a better approach would be to reset synced only if it is
marked as synced. Adding a LogicalSlot check wouldn't be incorrect,
but IMO, it may not be necessary.

Thinking further on this, I believe it’s fine even if the slot is
forcefully set to false on the primary. The slot type or its sync
status doesn’t really matter in this case.

--
With Regards,
Ashutosh Sharma.

#17

shveta.malik@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#16)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Fri, Sep 19, 2025 at 7:29 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

On Fri, Sep 19, 2025 at 3:04 PM shveta malik <shveta.malik@gmail.com> wrote:
On Thu, Sep 18, 2025 at 5:20 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Hi Ajin,

On Thu, Sep 18, 2025 at 4:16 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta.malik@gmail.com> wrote:

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.

Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.

The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset all replication slots that have synced=true to synced=false.
+ */
I feel this is not correct, we are force resetting sync flag status
for all logical slots, not just the one that is set to true.

--

@@ -2664,6 +2715,10 @@ RestoreSlotFromDisk(const char *name)
memcpy(&slot->data, &cp.slotdata,
sizeof(ReplicationSlotPersistentData));
+ /* reset synced flag if this is a primary server */
+ if (!StandbyMode)
+ slot->data.synced = false;
+
I think you also need to ensure that you are only doing this for the
logical slots, it's currently just checking if the slot is in-use or
not.
I think a better approach would be to reset synced only if it is
marked as synced. Adding a LogicalSlot check wouldn't be incorrect,
but IMO, it may not be necessary.
Thinking further on this, I believe it’s fine even if the slot is
forcefully set to false on the primary. The slot type or its sync
status doesn’t really matter in this case.

I think we need to mark the slots dirty once we set its synced-flag to
false. In such a case, we should only mark those which actually needed
synced flag resetting. IMO, we need a 'synced' flag check here.

thanks
Shveta

#18

ashu.coek88@gmail.com

4 months ago

In reply to: shveta malik (#17)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Mon, Sep 22, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:

On Fri, Sep 19, 2025 at 7:29 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
On Fri, Sep 19, 2025 at 3:04 PM shveta malik <shveta.malik@gmail.com> wrote:
On Thu, Sep 18, 2025 at 5:20 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
Hi Ajin,

On Thu, Sep 18, 2025 at 4:16 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta.malik@gmail.com> wrote:

The approach seems valid and should work, but introducing a new file
like promote.inprogress for this purpose might be excessive. We can
first try analyzing existing information to determine whether we can
distinguish between the two scenarios -- a primary in crash recovery
immediately after a promotion attempt versus a regular primary. If we
are unable to find any way, we can revisit the idea.

I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.

Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.

The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset all replication slots that have synced=true to synced=false.
+ */
I feel this is not correct, we are force resetting sync flag status
for all logical slots, not just the one that is set to true.

--

@@ -2664,6 +2715,10 @@ RestoreSlotFromDisk(const char *name)
memcpy(&slot->data, &cp.slotdata,
sizeof(ReplicationSlotPersistentData));
+ /* reset synced flag if this is a primary server */
+ if (!StandbyMode)
+ slot->data.synced = false;
+
I think you also need to ensure that you are only doing this for the
logical slots, it's currently just checking if the slot is in-use or
not.
I think a better approach would be to reset synced only if it is
marked as synced. Adding a LogicalSlot check wouldn't be incorrect,
but IMO, it may not be necessary.
Thinking further on this, I believe it’s fine even if the slot is
forcefully set to false on the primary. The slot type or its sync
status doesn’t really matter in this case.
I think we need to mark the slots dirty once we set its synced-flag to
false. In such a case, we should only mark those which actually needed
synced flag resetting. IMO, we need a 'synced' flag check here.

Fair enough, point taken.

--
With Regards,
Ashutosh Sharma.

#19

itsajin@gmail.com

4 months ago

In reply to: shveta malik (#14)

1 attachment(s)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Fri, Sep 19, 2025 at 7:26 PM shveta malik <shveta.malik@gmail.com> wrote:

Please find a few comments:

1)
+ * Reset all replication slots that have synced=true to synced=false.

Can we please change it to:
Reset the synced flag to false for all replication slots where it is
currently true.

Changed.

2)
I was wondering that since we reset the sync flag everytime we load
slots from disk , then do we even need ResetSyncedSlots() during
promotion? But I guess we still need it because even after promotion
(if not restarted), existing backend sessions stay alive and it makes
sense if they too see 'synced' as false after promotion. Is it worth
adding this in comments atop ResetSyncedSlots() call during promotion?

Yes, on promotion a server could run for a long time without ever
being restarted. Do we want slots to remain in the synced state until
the next restart? I think not. I think it's best to have the logic
both in the promotion path and in the restart path.
Added a comment in xlog.c for this.

3)
+ if (!StandbyMode)
+ slot->data.synced = false;
a)
Do we need to mark the slot as dirty so it gets saved to disk on the
next chance?

I think ReplicationSlotSave can be skipped, as it may not be
appropriate in the restore flow. But marking the slot dirty is
important to avoid resetting the sync flag again on the next startup.
A crash between marking it dirty and persisting it would still require
a reset, but that seems acceptable. Thoughts?

b)
Also if we are marking it dirty, it makes sense to set synced to false
only after checking if synced is true already.

Changed this accordingly. Marking it as dirty also required acquiring
the slot and releasing it afterwards. I've also not checked for
logical slots, as if the synced flag is set, it should only be for
logical slots.

Attaching v3 with the above code changes.

regards,
Ajin Cherian
Fujitsu Australia

Attachments:

v3-0001-Reset-synced-slots-when-a-standby-is-promoted.patchapplication/octet-stream; name=v3-0001-Reset-synced-slots-when-a-standby-is-promoted.patchDownload

From 525a9e94fa6ca652f9a79eb2c914382cf527520e Mon Sep 17 00:00:00 2001
From: Ajin Cherian <itsajin@gmail.com>
Date: Tue, 23 Sep 2025 16:54:17 +1000
Subject: [PATCH v3] Reset synced slots when a standby is promoted.

On promotion, reset any slots which have the 'synced' flag set so that
the primary starts with synced flag set false. This ensures consistent
behavior across all switchovers. Also handle the possibility of server
crashing before all slots are reset by reseting slots on primary on a
restart.
---
 doc/src/sgml/system-views.sgml                |  3 +-
 src/backend/access/transam/xlog.c             | 18 +++--
 src/backend/access/transam/xlogrecovery.c     |  9 ---
 src/backend/replication/slot.c                | 68 +++++++++++++++++++
 src/include/replication/slot.h                |  1 +
 .../t/040_standby_failover_slots_sync.pl      |  6 +-
 6 files changed, 86 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 4187191ea74..ff9384127cd 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3031,8 +3031,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
        On a hot standby, the slots with the synced column marked as true can
        neither be used for logical decoding nor dropped manually. The value
        of this column has no meaning on the primary server; the column value on
-       the primary is default false for all slots but may (if leftover from a
-       promoted standby) also be true.
+       the primary is false for all slots.
       </para></entry>
      </row>
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index eac1de75ed0..5ebb74888b0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5639,7 +5639,8 @@ StartupXLOG(void)
 
 	/*
 	 * Initialize replication slots, before there's a chance to remove
-	 * required resources.
+	 * required resources. Clear any leftover 'synced' flags on replication
+	 * slots when on the primary.
 	 */
 	StartupReplicationSlots();
 
@@ -6241,13 +6242,20 @@ StartupXLOG(void)
 	WalSndWakeup(true, true);
 
 	/*
-	 * If this was a promotion, request an (online) checkpoint now. This isn't
-	 * required for consistency, but the last restartpoint might be far back,
-	 * and in case of a crash, recovering from it might take a longer than is
-	 * appropriate now that we're not in standby mode anymore.
+	 * If this was a promotion, first reset any slots that had been marked as
+	 * synced during standby mode. Although slots that are marked as synced
+	 * are reset on a restart of the primary, we need to do it in the promotion
+	 * path as it could be some time before the next restart.
+	 * Then request an (online) checkpoint. The checkpoint isn't required for
+	 * consistency, but the last restartpoint might be far back, and in case
+	 * of a crash, recovery could take longer than desirable now that we're not
+	 * in standby mode anymore.
 	 */
 	if (promoted)
+	{
+		ResetSyncedSlots();
 		RequestCheckpoint(CHECKPOINT_FORCE);
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 346319338a0..37ad309201e 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1481,15 +1481,6 @@ FinishWalRecovery(void)
 	/*
 	 * Shutdown the slot sync worker to drop any temporary slots acquired by
 	 * it and to prevent it from keep trying to fetch the failover slots.
-	 *
-	 * We do not update the 'synced' column in 'pg_replication_slots' system
-	 * view from true to false here, as any failed update could leave 'synced'
-	 * column false for some slots. This could cause issues during slot sync
-	 * after restarting the server as a standby. While updating the 'synced'
-	 * column after switching to the new timeline is an option, it does not
-	 * simplify the handling for the 'synced' column. Therefore, we retain the
-	 * 'synced' column as true after promotion as it may provide useful
-	 * information about the slot origin.
 	 */
 	ShutDownSlotSync();
 
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb96d42..1a15ae32226 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -852,6 +852,58 @@ restart:
 	LWLockRelease(ReplicationSlotControlLock);
 }
 
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset the synced flag to false for all replication slots where it is
+ * currently true.
+ */
+void
+ResetSyncedSlots(void)
+{
+	int			i;
+
+	/*
+	 * Iterate through all replication slot entries and reset synced ones
+	 */
+	for (i = 0; i < max_replication_slots; i++)
+	{
+		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+		/* Skip inactive/unused slots */
+		if (!s->in_use)
+			continue;
+
+		/* we're only interested in logical slots */
+		if (!SlotIsLogical(s))
+			continue;
+
+		/* Check if this slot was marked as synced */
+		if (s->data.synced)
+		{
+			/* Acquire the slot */
+			ReplicationSlotAcquire(NameStr(s->data.name), false, true);
+
+			/* Reset the synced flag under spinlock protection */
+			SpinLockAcquire(&s->mutex);
+			s->data.synced = false;
+			SpinLockRelease(&s->mutex);
+
+			/* Mark dirty and save outside the spinlock */
+			ReplicationSlotMarkDirty();
+			ReplicationSlotSave();
+
+			ereport(LOG,
+				(errmsg("reset synced flag for replication slot \"%s\"",
+					NameStr(s->data.name))));
+
+			/* Release the slot */
+			ReplicationSlotRelease();
+		}
+	}
+
+}
+
 /*
  * Permanently drop replication slot identified by the passed in name.
  */
@@ -2690,6 +2742,22 @@ RestoreSlotFromDisk(const char *name)
 		ReplicationSlotSetInactiveSince(slot, now, false);
 
 		restored = true;
+
+		/*
+		 * A primary should never have a slot with the 'synced' flag set.
+		 * Even if this server was previously a standby, the flag should
+		 * have been cleared during promotion. The only case it may still
+		 * be set is if the server crashed during promotion. In that case,
+		 * reset it now and mark the slot dirty.
+		 */
+		if (!StandbyMode && slot->data.synced)
+		{
+			ReplicationSlotAcquire(NameStr(slot->data.name), false, true);
+			slot->data.synced = false;
+			ReplicationSlotMarkDirty();
+			ReplicationSlotRelease();
+		}
+
 		break;
 	}
 
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162cde3..7902d51781d 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -336,6 +336,7 @@ extern int	ReplicationSlotIndex(ReplicationSlot *slot);
 extern bool ReplicationSlotName(int index, Name name);
 extern void ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char *syncslotname, Size szslot);
 extern void ReplicationSlotDropAtPubNode(WalReceiverConn *wrconn, char *slotname, bool missing_ok);
+extern void ResetSyncedSlots(void);
 
 extern void StartupReplicationSlots(void);
 extern void CheckPointReplicationSlots(bool is_shutdown);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 2c61c51e914..0f225aa09c1 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -932,13 +932,13 @@ my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
 $subscriber1->safe_psql('postgres',
 	"ALTER SUBSCRIPTION regress_mysub1 CONNECTION '$standby1_conninfo';");
 
-# Confirm the synced slot 'lsub1_slot' is retained on the new primary
+# Confirm the synced slot 'lsub1_slot' is reset on the new primary
 is( $standby1->safe_psql(
 		'postgres',
 		q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN ('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
 	),
-	't',
-	'synced slot retained on the new primary');
+	'f',
+	'synced slot reset on the new primary');
 
 # Commit the prepared transaction
 $standby1->safe_psql('postgres', "COMMIT PREPARED 'test_twophase_slotsync';");
-- 
2.47.3

#20

shveta.malik@gmail.com

4 months ago

In reply to: Ajin Cherian (#19)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Tue, Sep 23, 2025 at 1:11 PM Ajin Cherian <itsajin@gmail.com> wrote:

Attaching v3 with the above code changes.

Thanks for the patch. Please find a few comments:

1)
+ if (!StandbyMode && slot->data.synced)
+ {
+ ReplicationSlotAcquire(NameStr(slot->data.name), false, true);
+ slot->data.synced = false;
+ ReplicationSlotMarkDirty();
+ ReplicationSlotRelease();
+ }

Do we really need to acquire in this startup flow? Can we directly
set the 'just_dirtied' and 'dirty' as true without calling
ReplicationSlotMarkDirty(). IMO, there is no possibility of another
session connection at this point right as we have not started up
completely, and thus it should be okay to directly mark it as dirty
without acquiring slot or taking locks.

2)
+ * If this was a promotion, first reset any slots that had been marked as
+ * synced during standby mode. Although slots that are marked as synced
+ * are reset on a restart of the primary, we need to do it in the promotion
+ * path as it could be some time before the next restart.

Shall we rephrase to:
If this was a promotion, first reset the synced flag for any logical
slots if it's set. Although the synced flag for logical slots is reset
on every primary restart, we also need to handle it during promotion
since existing backend sessions remain active even after promotion,
and a restart may not happen for some time.

3)
+ ereport(LOG,
+ (errmsg("reset synced flag for replication slot \"%s\"",
+ NameStr(s->data.name))));

a) Shall we change it to DEBUG1?
b) Shall the msg be:
synced flag reset for replication slot \"%s\" during promotion

4)
Regarding:
-# Confirm the synced slot 'lsub1_slot' is retained on the new primary
+# Confirm the synced slot 'lsub1_slot' is reset on the new primary
is( $standby1->safe_psql(
'postgres',
q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
),
- 't',
- 'synced slot retained on the new primary');
+ 'f',
+ 'synced slot reset on the new primary');

I think the original test was trying to confirm that both the logical
slots are retained after promotion. See test-title:
# Promote the standby1 to primary. Confirm that:
# a) the slot 'lsub1_slot' and 'snap_test_slot' are retained on the new primary

But with this change, it will not be able to verify that. Shall we
modify the test to say 'Confirm that the slots 'lsub1_slot' and
'snap_test_slot' are retained on the new primary and synced flag is
cleared' and change the query to have 'NOT synced'. Thoughts?

thanks
Shveta

#21

ashu.coek88@gmail.com

4 months ago

In reply to: shveta malik (#20)

Re: Clear logical slot's 'synced' flag on promotion of standby

3)
+ ereport(LOG,
+ (errmsg("reset synced flag for replication slot \"%s\"",
+ NameStr(s->data.name))));
a) Shall we change it to DEBUG1?
b) Shall the msg be:
synced flag reset for replication slot \"%s\" during promotion

I think this can stay as a LOG message, it only runs once at startup
and applies just to logical slots, so it won’t be noisy. I’d also
avoid mentioning “during promotion,” since the flag might accidentally
be set on the primary and then reset later during startup, making that
description inaccurate.

--
With Regards,
Ashutosh Sharma.

#22

itsajin@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#21)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Tue, Sep 23, 2025 at 11:11 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

3)
+ ereport(LOG,
+ (errmsg("reset synced flag for replication slot \"%s\"",
+ NameStr(s->data.name))));
a) Shall we change it to DEBUG1?
b) Shall the msg be:
synced flag reset for replication slot \"%s\" during promotion
I think this can stay as a LOG message, it only runs once at startup
and applies just to logical slots, so it won’t be noisy. I’d also
avoid mentioning “during promotion,” since the flag might accidentally
be set on the primary and then reset later during startup, making that
description inaccurate.

Oops! that was actually leftover debug logs that I used in my testing
but I am happy to leave it as DEBUG1 or LOG1 as necessary.

regards,
Ajin Cherian
Fujitsu Australia

#23

shveta.malik@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#21)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Tue, Sep 23, 2025 at 6:41 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

3)
+ ereport(LOG,
+ (errmsg("reset synced flag for replication slot \"%s\"",
+ NameStr(s->data.name))));
a) Shall we change it to DEBUG1?
b) Shall the msg be:
synced flag reset for replication slot \"%s\" during promotion
I think this can stay as a LOG message, it only runs once at startup
and applies just to logical slots, so it won’t be noisy.

It does not run at startup, it runs during promotion. Having said
that, if there are a lot many slots, and we have one message per slot,
overall logs can still be more. I somehow find DEBUG better here. But
we can leave it as LOG and can revisit later when others review.

I’d also
avoid mentioning “during promotion,” since the flag might accidentally
be set on the primary and then reset later during startup, making that
description inaccurate.

This function is called only during promotion (see check 'if
(promoted)') and thus the suggested message ( “during promotion")
seems better to me.

Ajin, we can add comments atop ResetSyncedSlots() that this function
is invoked/used only during promotion, making the usage more clear.

thanks
Shveta

#24

ashu.coek88@gmail.com

4 months ago

In reply to: shveta malik (#23)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Wed, Sep 24, 2025 at 9:42 AM shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Sep 23, 2025 at 6:41 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
3)
+ ereport(LOG,
+ (errmsg("reset synced flag for replication slot \"%s\"",
+ NameStr(s->data.name))));
a) Shall we change it to DEBUG1?
b) Shall the msg be:
synced flag reset for replication slot \"%s\" during promotion
I think this can stay as a LOG message, it only runs once at startup
and applies just to logical slots, so it won’t be noisy.
It does not run at startup, it runs during promotion. Having said
that, if there are a lot many slots, and we have one message per slot,
overall logs can still be more. I somehow find DEBUG better here. But
we can leave it as LOG and can revisit later when others review.

I’d also
avoid mentioning “during promotion,” since the flag might accidentally
be set on the primary and then reset later during startup, making that
description inaccurate.

This function is called only during promotion (see check 'if
(promoted)') and thus the suggested message ( “during promotion")
seems better to me.

ResetSyncedSlots might be called during promotion, but
RestoreSlotFromDisk (which runs during standard PostgreSQL startup)
performs the same functionality. This creates a scenario where the
sync flag could be reset during either promotion or regular startup. I
think we should either remove it, or ensure it is present in both
places for consistency.

--
With Regards,
Ashutosh Sharma.

#25

itsajin@gmail.com

4 months ago

In reply to: shveta malik (#20)

1 attachment(s)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Tue, Sep 23, 2025 at 7:54 PM shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Sep 23, 2025 at 1:11 PM Ajin Cherian <itsajin@gmail.com> wrote:

Attaching v3 with the above code changes.

Thanks for the patch. Please find a few comments:
1)
+ if (!StandbyMode && slot->data.synced)
+ {
+ ReplicationSlotAcquire(NameStr(slot->data.name), false, true);
+ slot->data.synced = false;
+ ReplicationSlotMarkDirty();
+ ReplicationSlotRelease();
+ }
Do we really need to acquire in this startup flow? Can we directly
set the 'just_dirtied' and 'dirty' as true without calling
ReplicationSlotMarkDirty(). IMO, there is no possibility of another
session connection at this point right as we have not started up
completely, and thus it should be okay to directly mark it as dirty
without acquiring slot or taking locks.

I've modified it accordingly.

2)
+ * If this was a promotion, first reset any slots that had been marked as
+ * synced during standby mode. Although slots that are marked as synced
+ * are reset on a restart of the primary, we need to do it in the promotion
+ * path as it could be some time before the next restart.
Shall we rephrase to:
If this was a promotion, first reset the synced flag for any logical
slots if it's set. Although the synced flag for logical slots is reset
on every primary restart, we also need to handle it during promotion
since existing backend sessions remain active even after promotion,
and a restart may not happen for some time.

Changed.

3)
+ ereport(LOG,
+ (errmsg("reset synced flag for replication slot \"%s\"",
+ NameStr(s->data.name))));
a) Shall we change it to DEBUG1?
b) Shall the msg be:
synced flag reset for replication slot \"%s\" during promotion

Changed to DEBUG1

4)
Regarding:
-# Confirm the synced slot 'lsub1_slot' is retained on the new primary
+# Confirm the synced slot 'lsub1_slot' is reset on the new primary
is( $standby1->safe_psql(
'postgres',
q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
),
- 't',
- 'synced slot retained on the new primary');
+ 'f',
+ 'synced slot reset on the new primary');
I think the original test was trying to confirm that both the logical
slots are retained after promotion. See test-title:
# Promote the standby1 to primary. Confirm that:
# a) the slot 'lsub1_slot' and 'snap_test_slot' are retained on the new primary

But with this change, it will not be able to verify that. Shall we
modify the test to say 'Confirm that the slots 'lsub1_slot' and
'snap_test_slot' are retained on the new primary and synced flag is
cleared' and change the query to have 'NOT synced'. Thoughts?

Changed.

Attaching v4 which addresses all the above comments.

regards,
Ajin Cherian
Fujitsu Australia

Attachments:

v4-0001-Reset-synced-slots-when-a-standby-is-promoted.patchapplication/octet-stream; name=v4-0001-Reset-synced-slots-when-a-standby-is-promoted.patchDownload

From 99ccd9ac9c712cbcd59a0ddbf58badcfecdd575c Mon Sep 17 00:00:00 2001
From: Ajin Cherian <itsajin@gmail.com>
Date: Fri, 26 Sep 2025 19:50:49 +1000
Subject: [PATCH v4] Reset synced slots when a standby is promoted.

On promotion, reset any slots which have the 'synced' flag set so that
the primary starts with synced flag set false. This ensures consistent
behavior across all switchovers. Also handle the possibility of server
crashing before all slots are reset by reseting slots on primary on a
restart.
---
 doc/src/sgml/system-views.sgml                |  3 +-
 src/backend/access/transam/xlog.c             | 19 ++++--
 src/backend/access/transam/xlogrecovery.c     |  9 ---
 src/backend/replication/slot.c                | 68 +++++++++++++++++++
 src/include/replication/slot.h                |  1 +
 .../t/040_standby_failover_slots_sync.pl      |  3 +-
 6 files changed, 86 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 4187191ea74..ff9384127cd 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3031,8 +3031,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
        On a hot standby, the slots with the synced column marked as true can
        neither be used for logical decoding nor dropped manually. The value
        of this column has no meaning on the primary server; the column value on
-       the primary is default false for all slots but may (if leftover from a
-       promoted standby) also be true.
+       the primary is false for all slots.
       </para></entry>
      </row>
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 109713315c0..20fdfa489e8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5639,7 +5639,8 @@ StartupXLOG(void)
 
 	/*
 	 * Initialize replication slots, before there's a chance to remove
-	 * required resources.
+	 * required resources. Clear any leftover 'synced' flags on replication
+	 * slots when on the primary.
 	 */
 	StartupReplicationSlots();
 
@@ -6241,13 +6242,21 @@ StartupXLOG(void)
 	WalSndWakeup(true, true);
 
 	/*
-	 * If this was a promotion, request an (online) checkpoint now. This isn't
-	 * required for consistency, but the last restartpoint might be far back,
-	 * and in case of a crash, recovering from it might take a longer than is
-	 * appropriate now that we're not in standby mode anymore.
+	 * If this was a promotion, first reset the synced flag for any logical
+	 * slots if it's set. Although the synced flag for logical slots is reset
+	 * on every primary restart, we also need to handle it during promotion
+	 * since existing backend sessions remain active even after promotion,
+	 * and a restart may not happen for some time.
+	 * Then request an (online) checkpoint. The checkpoint isn't required for
+	 * consistency, but the last restartpoint might be far back, and in case
+	 * of a crash, recovery could take longer than desirable now that we're not
+	 * in standby mode anymore.
 	 */
 	if (promoted)
+	{
+		ResetSyncedSlots();
 		RequestCheckpoint(CHECKPOINT_FORCE);
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 52ff4d119e6..6e975c12a97 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1482,15 +1482,6 @@ FinishWalRecovery(void)
 	/*
 	 * Shutdown the slot sync worker to drop any temporary slots acquired by
 	 * it and to prevent it from keep trying to fetch the failover slots.
-	 *
-	 * We do not update the 'synced' column in 'pg_replication_slots' system
-	 * view from true to false here, as any failed update could leave 'synced'
-	 * column false for some slots. This could cause issues during slot sync
-	 * after restarting the server as a standby. While updating the 'synced'
-	 * column after switching to the new timeline is an option, it does not
-	 * simplify the handling for the 'synced' column. Therefore, we retain the
-	 * 'synced' column as true after promotion as it may provide useful
-	 * information about the slot origin.
 	 */
 	ShutDownSlotSync();
 
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb96d42..460fba505d1 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -852,6 +852,59 @@ restart:
 	LWLockRelease(ReplicationSlotControlLock);
 }
 
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset the synced flag to false for all replication slots where it is
+ * currently true. Currently this function is only invoked during promotion.
+ */
+void
+ResetSyncedSlots(void)
+{
+	int			i;
+
+	/*
+	 * Iterate through all replication slot entries and reset synced ones
+	 */
+	for (i = 0; i < max_replication_slots; i++)
+	{
+		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+		/* Skip inactive/unused slots */
+		if (!s->in_use)
+			continue;
+
+		/* we're only interested in logical slots */
+		if (!SlotIsLogical(s))
+			continue;
+
+		/* Check if this slot was marked as synced */
+		if (s->data.synced)
+		{
+			/* Acquire the slot */
+			ReplicationSlotAcquire(NameStr(s->data.name), false, true);
+
+			/* Reset the synced flag under spinlock protection */
+			SpinLockAcquire(&s->mutex);
+			s->data.synced = false;
+			SpinLockRelease(&s->mutex);
+
+			/* Mark dirty and save outside the spinlock */
+			ReplicationSlotMarkDirty();
+			ReplicationSlotSave();
+
+			ereport(DEBUG1,
+				(errmsg("synced flag reset for replication slot \"%s\""
+						" during promotion",
+						NameStr(s->data.name))));
+
+			/* Release the slot */
+			ReplicationSlotRelease();
+		}
+	}
+
+}
+
 /*
  * Permanently drop replication slot identified by the passed in name.
  */
@@ -2690,6 +2743,21 @@ RestoreSlotFromDisk(const char *name)
 		ReplicationSlotSetInactiveSince(slot, now, false);
 
 		restored = true;
+
+		/*
+		 * A primary should never have a slot with the 'synced' flag set.
+		 * Even if this server was previously a standby, the flag should
+		 * have been cleared during promotion. The only case it may still
+		 * be set is if the server crashed during promotion. In that case,
+		 * reset it now and mark the slot dirty.
+		 */
+		if (!StandbyMode && slot->data.synced)
+		{
+			slot->data.synced = false;
+			slot->just_dirtied = true;
+			slot->dirty = true;
+		}
+
 		break;
 	}
 
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162cde3..7902d51781d 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -336,6 +336,7 @@ extern int	ReplicationSlotIndex(ReplicationSlot *slot);
 extern bool ReplicationSlotName(int index, Name name);
 extern void ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char *syncslotname, Size szslot);
 extern void ReplicationSlotDropAtPubNode(WalReceiverConn *wrconn, char *slotname, bool missing_ok);
+extern void ResetSyncedSlots(void);
 
 extern void StartupReplicationSlots(void);
 extern void CheckPointReplicationSlots(bool is_shutdown);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 2c61c51e914..d7bb73d81a7 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -935,7 +935,8 @@ $subscriber1->safe_psql('postgres',
 # Confirm the synced slot 'lsub1_slot' is retained on the new primary
 is( $standby1->safe_psql(
 		'postgres',
-		q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN ('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
+		q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN ('lsub1_slot', 'snap_test_slot') AND NOT synced AND NOT temporary;}
+
 	),
 	't',
 	'synced slot retained on the new primary');
-- 
2.47.3

#26

shveta.malik@gmail.com

4 months ago

In reply to: Ashutosh Sharma (#24)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Wed, Sep 24, 2025 at 10:18 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

On Wed, Sep 24, 2025 at 9:42 AM shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 23, 2025 at 6:41 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
3)
+ ereport(LOG,
+ (errmsg("reset synced flag for replication slot \"%s\"",
+ NameStr(s->data.name))));
a) Shall we change it to DEBUG1?
b) Shall the msg be:
synced flag reset for replication slot \"%s\" during promotion
I think this can stay as a LOG message, it only runs once at startup
and applies just to logical slots, so it won’t be noisy.
It does not run at startup, it runs during promotion. Having said
that, if there are a lot many slots, and we have one message per slot,
overall logs can still be more. I somehow find DEBUG better here. But
we can leave it as LOG and can revisit later when others review.

I’d also
avoid mentioning “during promotion,” since the flag might accidentally
be set on the primary and then reset later during startup, making that
description inaccurate.

This function is called only during promotion (see check 'if
(promoted)') and thus the suggested message ( “during promotion")
seems better to me.
ResetSyncedSlots might be called during promotion, but
RestoreSlotFromDisk (which runs during standard PostgreSQL startup)
performs the same functionality. This creates a scenario where the
sync flag could be reset during either promotion or regular startup. I
think we should either remove it, or ensure it is present in both
places for consistency.

Sorry I missed this email somehow. Yes, I agree. We can have DEBUG1 at
both places. But let's see what others think on this.

thanks
Shveta

#27

shveta.malik@gmail.com

4 months ago

In reply to: Ajin Cherian (#25)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Fri, Sep 26, 2025 at 3:26 PM Ajin Cherian <itsajin@gmail.com> wrote:

Attaching v4 which addresses all the above comments.

Few trivial comments:

1)
 # Confirm the synced slot 'lsub1_slot' is retained on the new primary
 is( $standby1->safe_psql(
  'postgres',
- q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
+ q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
('lsub1_slot', 'snap_test_slot') AND NOT synced AND NOT temporary;}
+
  ),
  't',
  'synced slot retained on the new primary');

a)
It is not fault of this patch, but I see comment and query not
matching. We shall have both the names 'lsub1_slot', 'snap_test_slot'
in comment.

b) Also it will be good to mention the expectation from synced flag in
the comment. How about:

Confirm the synced slots 'lsub1_slot' and 'snap_test_slot' are
retained on the new primary and 'synced' flag is cleared on promotion.

2)
As Ashutosh suggested, even in RestoreSlotFromDisk(), we can have
DEBUG1 msg: "synced flag reset for replication slot \"%s\""

thanks
Shveta

#28

[1]: /messages/by-id/CAA5-nLAqGpBFEAr2XNYMj3E+39caQra_SJeB5MCtp7PCyLTiOg@mail.gmail.com

itsajin@gmail.com

3 months ago

In reply to: shveta malik (#27)

1 attachment(s)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Mon, Sep 29, 2025 at 4:11 PM shveta malik <shveta.malik@gmail.com> wrote:

On Fri, Sep 26, 2025 at 3:26 PM Ajin Cherian <itsajin@gmail.com> wrote:

Attaching v4 which addresses all the above comments.

Few trivial comments:
1)
# Confirm the synced slot 'lsub1_slot' is retained on the new primary
is( $standby1->safe_psql(
'postgres',
- q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
+ q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
('lsub1_slot', 'snap_test_slot') AND NOT synced AND NOT temporary;}
+
),
't',
'synced slot retained on the new primary');
a)
It is not fault of this patch, but I see comment and query not
matching. We shall have both the names 'lsub1_slot', 'snap_test_slot'
in comment.

Fixed.

b) Also it will be good to mention the expectation from synced flag in
the comment. How about:

Confirm the synced slots 'lsub1_slot' and 'snap_test_slot' are
retained on the new primary and 'synced' flag is cleared on promotion.

Added.

2)
As Ashutosh suggested, even in RestoreSlotFromDisk(), we can have
DEBUG1 msg: "synced flag reset for replication slot \"%s\""

Added.

Attaching patch v5 addressing the above changes.

regards,
Ajin Cherian
Fujitsu Australia

Attachments:

v5-0001-Reset-synced-slots-when-a-standby-is-promoted.patchapplication/octet-stream; name=v5-0001-Reset-synced-slots-when-a-standby-is-promoted.patchDownload

From de4deb80756e464cd0c732c4bc2d415d3c5a5074 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <itsajin@gmail.com>
Date: Fri, 3 Oct 2025 19:14:06 +1000
Subject: [PATCH v5] Reset synced slots when a standby is promoted.

On promotion, reset any slots which have the 'synced' flag set so that
the primary starts with synced flag set false. This ensures consistent
behavior across all switchovers. Also handle the possibility of server
crashing before all slots are reset by reseting slots on primary on a
restart.
---
 doc/src/sgml/system-views.sgml                |  3 +-
 src/backend/access/transam/xlog.c             | 19 +++--
 src/backend/access/transam/xlogrecovery.c     |  9 ---
 src/backend/replication/slot.c                | 72 +++++++++++++++++++
 src/include/replication/slot.h                |  1 +
 .../t/040_standby_failover_slots_sync.pl      |  6 +-
 6 files changed, 92 insertions(+), 18 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 4187191ea74..ff9384127cd 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3031,8 +3031,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
        On a hot standby, the slots with the synced column marked as true can
        neither be used for logical decoding nor dropped manually. The value
        of this column has no meaning on the primary server; the column value on
-       the primary is default false for all slots but may (if leftover from a
-       promoted standby) also be true.
+       the primary is false for all slots.
       </para></entry>
      </row>
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index eceab341255..02106da3108 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5642,7 +5642,8 @@ StartupXLOG(void)
 
 	/*
 	 * Initialize replication slots, before there's a chance to remove
-	 * required resources.
+	 * required resources. Clear any leftover 'synced' flags on replication
+	 * slots when on the primary.
 	 */
 	StartupReplicationSlots();
 
@@ -6244,13 +6245,21 @@ StartupXLOG(void)
 	WalSndWakeup(true, true);
 
 	/*
-	 * If this was a promotion, request an (online) checkpoint now. This isn't
-	 * required for consistency, but the last restartpoint might be far back,
-	 * and in case of a crash, recovering from it might take a longer than is
-	 * appropriate now that we're not in standby mode anymore.
+	 * If this was a promotion, first reset the synced flag for any logical
+	 * slots if it's set. Although the synced flag for logical slots is reset
+	 * on every primary restart, we also need to handle it during promotion
+	 * since existing backend sessions remain active even after promotion,
+	 * and a restart may not happen for some time.
+	 * Then request an (online) checkpoint. The checkpoint isn't required for
+	 * consistency, but the last restartpoint might be far back, and in case
+	 * of a crash, recovery could take longer than desirable now that we're not
+	 * in standby mode anymore.
 	 */
 	if (promoted)
+	{
+		ResetSyncedSlots();
 		RequestCheckpoint(CHECKPOINT_FORCE);
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 52ff4d119e6..6e975c12a97 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1482,15 +1482,6 @@ FinishWalRecovery(void)
 	/*
 	 * Shutdown the slot sync worker to drop any temporary slots acquired by
 	 * it and to prevent it from keep trying to fetch the failover slots.
-	 *
-	 * We do not update the 'synced' column in 'pg_replication_slots' system
-	 * view from true to false here, as any failed update could leave 'synced'
-	 * column false for some slots. This could cause issues during slot sync
-	 * after restarting the server as a standby. While updating the 'synced'
-	 * column after switching to the new timeline is an option, it does not
-	 * simplify the handling for the 'synced' column. Therefore, we retain the
-	 * 'synced' column as true after promotion as it may provide useful
-	 * information about the slot origin.
 	 */
 	ShutDownSlotSync();
 
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fd0fdb96d42..2e9f286ec07 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -852,6 +852,59 @@ restart:
 	LWLockRelease(ReplicationSlotControlLock);
 }
 
+/*
+ * ResetSyncedSlots()
+ *
+ * Reset the synced flag to false for all replication slots where it is
+ * currently true. Currently this function is only invoked during promotion.
+ */
+void
+ResetSyncedSlots(void)
+{
+	int			i;
+
+	/*
+	 * Iterate through all replication slot entries and reset synced ones
+	 */
+	for (i = 0; i < max_replication_slots; i++)
+	{
+		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+		/* Skip inactive/unused slots */
+		if (!s->in_use)
+			continue;
+
+		/* we're only interested in logical slots */
+		if (!SlotIsLogical(s))
+			continue;
+
+		/* Check if this slot was marked as synced */
+		if (s->data.synced)
+		{
+			/* Acquire the slot */
+			ReplicationSlotAcquire(NameStr(s->data.name), false, true);
+
+			/* Reset the synced flag under spinlock protection */
+			SpinLockAcquire(&s->mutex);
+			s->data.synced = false;
+			SpinLockRelease(&s->mutex);
+
+			/* Mark dirty and save outside the spinlock */
+			ReplicationSlotMarkDirty();
+			ReplicationSlotSave();
+
+			ereport(DEBUG1,
+				(errmsg("synced flag reset for replication slot \"%s\""
+						" during promotion",
+						NameStr(s->data.name))));
+
+			/* Release the slot */
+			ReplicationSlotRelease();
+		}
+	}
+
+}
+
 /*
  * Permanently drop replication slot identified by the passed in name.
  */
@@ -2690,6 +2743,25 @@ RestoreSlotFromDisk(const char *name)
 		ReplicationSlotSetInactiveSince(slot, now, false);
 
 		restored = true;
+
+		/*
+		 * A primary should never have a slot with the 'synced' flag set.
+		 * Even if this server was previously a standby, the flag should
+		 * have been cleared during promotion. The only case it may still
+		 * be set is if the server crashed or failed during promotion before
+		 * the flag could be reset.
+		 * In that case, reset it now and mark the slot dirty.
+		 */
+		if (!StandbyMode && slot->data.synced)
+		{
+			slot->data.synced = false;
+			slot->just_dirtied = true;
+			slot->dirty = true;
+			ereport(DEBUG1,
+					(errmsg("synced flag reset for replication slot \"%s\"",
+						NameStr(slot->data.name))));
+		}
+
 		break;
 	}
 
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index fe62162cde3..7902d51781d 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -336,6 +336,7 @@ extern int	ReplicationSlotIndex(ReplicationSlot *slot);
 extern bool ReplicationSlotName(int index, Name name);
 extern void ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char *syncslotname, Size szslot);
 extern void ReplicationSlotDropAtPubNode(WalReceiverConn *wrconn, char *slotname, bool missing_ok);
+extern void ResetSyncedSlots(void);
 
 extern void StartupReplicationSlots(void);
 extern void CheckPointReplicationSlots(bool is_shutdown);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 2c61c51e914..29a48019eda 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -932,10 +932,12 @@ my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
 $subscriber1->safe_psql('postgres',
 	"ALTER SUBSCRIPTION regress_mysub1 CONNECTION '$standby1_conninfo';");
 
-# Confirm the synced slot 'lsub1_slot' is retained on the new primary
+# Confirm that the synced slots 'lsub1_slot' and 'snap_test_slot' are retained on the new primary
+# and the synced flag is cleared on promotion.
 is( $standby1->safe_psql(
 		'postgres',
-		q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN ('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
+		q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN ('lsub1_slot', 'snap_test_slot') AND NOT synced AND NOT temporary;}
+
 	),
 	't',
 	'synced slot retained on the new primary');
-- 
2.47.3

#29

Masahiko Sawada

sawada.mshk@gmail.com

3 months ago

In reply to: shveta malik (#6)

Re: Clear logical slot's 'synced' flag on promotion of standby

On Wed, Sep 10, 2025 at 9:00 PM shveta malik <shveta.malik@gmail.com> wrote:

On Wed, Sep 10, 2025 at 5:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Sep 8, 2025 at 11:21 PM shveta malik <shveta.malik@gmail.com> wrote:

Hi,

This is a spin-off thread from [1].

Currently, in the slot-sync worker, we have an error scenario [2]
where, during slot synchronization, if we detect a slot with the same
name and its synced flag is set to false, we emit an error. The
rationale is to avoid potentially overwriting a user-created slot.

But while analyzing [1], we observed that this error can lead to
inconsistent behavior during switchovers. On the first switchover, the
new standby logs an error: "Exiting from slot synchronization because
a slot with the same name already exists on the standby." But during
a double switchover, this error does not occur.

Upon re-evaluating this, it seems more appropriate to clear the synced
flag after promotion, as the flag does not hold any meaning on the
primary. Doing so would ensure consistent behavior across all
switchovers, as the same error will be raised avoiding the risk of
overwriting user's slots.

There is the following comment in FinishWalRecovery():

/*
* Shutdown the slot sync worker to drop any temporary slots acquired by
* it and to prevent it from keep trying to fetch the failover slots.
*
* We do not update the 'synced' column in 'pg_replication_slots' system
* view from true to false here, as any failed update could leave 'synced'
* column false for some slots. This could cause issues during slot sync
* after restarting the server as a standby. While updating the 'synced'
* column after switching to the new timeline is an option, it does not
* simplify the handling for the 'synced' column. Therefore, we retain the
* 'synced' column as true after promotion as it may provide useful
* information about the slot origin.
*/
ShutDownSlotSync();

Does the patch address the above concerns?

Yes, the patch is attempting to address the above concern. it is
trying to Reset synced-column after switching to a new timeline. There
is an issue though as pointed out by Ashutosh in [1], which needs to
be addressed.

Nice.

There's an ongoing discussion about a patch that would allow users to
overwrite slot properties[1]/messages/by-id/CAA5-nLAqGpBFEAr2XNYMj3E+39caQra_SJeB5MCtp7PCyLTiOg@mail.gmail.com. IIUC, the reported inconsistency during
switchover would be resolved by that slot-overwriting patch. I'm
looking into the relationship between the patch discussed in this
thread and the slot-overwriting patch. While I'm not yet convinced
that the proposed allowing slot patch is the right approach, suppose
that we do allow slot overwriting somehow, what value would the patch
proposed in this thread add? Would its only benefit be ensuring that
the 'synced' flag is set to false on the primary?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#30