Replication slot is not able to sync up

Started by Suraj Kharage · 7 months ago · 40 messages

#1 Suraj Kharage <suraj.kharage@enterprisedb.com>

Hi,

Noticed the behaviour below, where a replication slot is not able to sync up if
any catalog changes happened after its creation.
We get the LOG below when trying to sync replication slots using the
pg_sync_replication_slots() function.
The newly created slot does not appear on the standby after this LOG:

2025-05-23 07:57:12.453 IST [4178805] LOG:  could not synchronize replication slot "failover_slot" because remote slot precedes local slot
2025-05-23 07:57:12.453 IST [4178805] DETAIL:  The remote slot has LSN 0/B000060 and catalog xmin 764, but the local slot has LSN 0/B000060 and catalog xmin 765.
2025-05-23 07:57:12.453 IST [4178805] STATEMENT:  SELECT pg_sync_replication_slots();

Below is the test case, tried on the latest master branch:
=========
- Create the Primary and start the server.
wal_level = logical

- Create the physical slot on Primary.
SELECT pg_create_physical_replication_slot('slot1');

- Set up the standby using pg_basebackup.
bin/pg_basebackup -D data1 -p 5418 -d "dbname=postgres" -R

primary_slot_name = 'slot1'
hot_standby_feedback = on
port = 5419

-- Start the standby.

-- Connect to Primary and create a logical replication slot.
SELECT pg_create_logical_replication_slot('failover_slot', 'pgoutput',
       false, false, true);  -- temporary = false, twophase = false, failover = true

postgres@4177929=#select xmin,* from pg_replication_slots ;
-[ RECORD 1 ]-------+---------------------------------
xmin                | 765
slot_name           | slot1
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 4177898
xmin                | 765
catalog_xmin        |
restart_lsn         | 0/B018B00
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f
two_phase_at        |
inactive_since      |
conflicting         |
invalidation_reason |
failover            | f
synced              | f
-[ RECORD 2 ]-------+---------------------------------
xmin                |
slot_name           | failover_slot
plugin              | pgoutput
slot_type           | logical
datoid              | 5
database            | postgres
temporary           | f
active              | f
active_pid          |
xmin                |
catalog_xmin        | 764
restart_lsn         | 0/B000060
confirmed_flush_lsn | 0/B000098
wal_status          | reserved
safe_wal_size       |
two_phase           | f
two_phase_at        |
inactive_since      | 2025-05-23 07:55:31.277584+05:30
conflicting         | f
invalidation_reason |
failover            | t
synced              | f

-- Perform some catalog changes. e.g.:
create table abc(id int);
postgres@4179034=#select xmin from pg_class where relname='abc';
xmin
------
764
(1 row)

-- Connect to the standby and try to sync the replication slots.
SELECT pg_sync_replication_slots();

In the logfile, we can see the LOG below:
2025-05-23 07:57:12.453 IST [4178805] LOG:  could not synchronize replication slot "failover_slot" because remote slot precedes local slot
2025-05-23 07:57:12.453 IST [4178805] DETAIL:  The remote slot has LSN 0/B000060 and catalog xmin 764, but the local slot has LSN 0/B000060 and catalog xmin 765.
2025-05-23 07:57:12.453 IST [4178805] STATEMENT:  SELECT pg_sync_replication_slots();

On the standby:
select xmin,* from pg_replication_slots ;
no rows

Primary -
postgres@4179034=#select xmin,* from pg_replication_slots ;
-[ RECORD 1 ]-------+---------------------------------
xmin                | 765
slot_name           | slot1
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | t
active_pid          | 4177898
xmin                | 765
catalog_xmin        |
restart_lsn         | 0/B018C08
confirmed_flush_lsn |
wal_status          | reserved
safe_wal_size       |
two_phase           | f
two_phase_at        |
inactive_since      |
conflicting         |
invalidation_reason |
failover            | f
synced              | f
-[ RECORD 2 ]-------+---------------------------------
xmin                |
slot_name           | failover_slot
plugin              | pgoutput
slot_type           | logical
datoid              | 5
database            | postgres
temporary           | f
active              | f
active_pid          |
xmin                |
catalog_xmin        | 764
restart_lsn         | 0/B000060
confirmed_flush_lsn | 0/B000098
wal_status          | reserved
safe_wal_size       |
two_phase           | f
two_phase_at        |
inactive_since      | 2025-05-23 07:55:31.277584+05:30
conflicting         | f
invalidation_reason |
failover            | t
synced              | f
=========

Is there any way to sync up the replication slot when catalog changes have
been made after its creation?
--

Thanks & Regards,
Suraj kharage,

enterprisedb.com <https://www.enterprisedb.com/>

#2 Amit Kapila <amit.kapila16@gmail.com>
In reply to: Suraj Kharage (#1)
Re: Replication slot is not able to sync up

On Fri, May 23, 2025 at 9:57 AM Suraj Kharage <
suraj.kharage@enterprisedb.com> wrote:


The remote_slot (the slot on the primary) should be advanced before you invoke
the sync. Can you call the pg_logical_slot_get_changes() API before performing
the sync? You can check the xmin of the logical slot after get_changes to
ensure that xmin has moved to 765 in your case.
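Sketched as a session, reusing the names from the example above (note: with
the pgoutput plugin, pg_logical_slot_get_changes() requires the proto_version
and publication_names options to be passed; 'mypub' here is a hypothetical
publication name):

```sql
-- On the primary: consume the pending changes so the slot advances.
SELECT * FROM pg_logical_slot_get_changes('failover_slot', NULL, NULL,
       'proto_version', '1', 'publication_names', 'mypub');

-- Still on the primary: confirm that catalog_xmin has moved forward.
SELECT catalog_xmin FROM pg_replication_slots
 WHERE slot_name = 'failover_slot';

-- On the standby: retry the synchronization.
SELECT pg_sync_replication_slots();
```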

--
With Regards,
Amit Kapila.

#3 Robert Haas <robertmhaas@gmail.com>
In reply to: Amit Kapila (#2)
Re: Replication slot is not able to sync up

On Fri, May 23, 2025 at 12:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

> The remote_slot (slot on primary) should be advanced before you invoke
> sync_slot. Can you do pg_logical_slot_get_changes() API before performing
> sync? You can check the xmin of the logical slot after get_changes to
> ensure that xmin has moved to 765 in your case.

I'm fairly dismayed by this example. I hope I'm misunderstanding
something, because otherwise I have difficulty understanding how we
thought it was OK to ship this feature in this condition.

At the moment that pg_sync_replication_slots() is executed, a slot
named failover_slot exists on only one of the two servers. How can you
justify emitting an error message complaining that "remote slot
precedes local slot"? There's only one slot! I understand that, under
the hood, we probably created an additional slot on the standby and
then tried to fast-forward it, and this error occurred in the second
step. But a user shouldn't have to understand those kinds of internal
implementation details to make sense of the error message. If the
problem is that we're not able to create a slot on the standby at an
old enough LSN or XID position to permit its use with the
corresponding slot on the master, it should be reported that way.

It also seems like having to execute a manual step like
pg_logical_slot_get_changes() in order for things to work is really
missing the point of the feature. I mean, it seems like the intention
of the feature was that someone can just periodically call
pg_sync_replication_slots() on each standby and the right things will
happen -- creating slots or fast-forwarding them or dropping them, as
required. But if that sometimes requires manual fiddling like having
to consume changes from a slot then basically the feature just doesn't
work, because now the user will have to somehow understand when that
is required and what they need to do to fix it. This doesn't even seem
like a particularly obscure case.

To be honest, even after spending quite a bit of time on this, I still
don't really understand what's happening with the xmins here. Just
after creating the logical slot on the primary, it has xmin 764 on one
slot and xmin 765 on the other, and I don't understand why that's the
case, nor why the extra DDL command is needed to trigger the problem.
But I also can't shake the feeling that I shouldn't *need* to
understand that stuff to use the feature. Isn't that the whole point?

--
Robert Haas
EDB: http://www.enterprisedb.com

#4 Amit Kapila <amit.kapila16@gmail.com>
In reply to: Robert Haas (#3)
Re: Replication slot is not able to sync up

On Fri, May 23, 2025 at 11:25 PM Robert Haas <robertmhaas@gmail.com> wrote:


> I'm fairly dismayed by this example. I hope I'm misunderstanding
> something, because otherwise I have difficulty understanding how we
> thought it was OK to ship this feature in this condition.
>
> At the moment that pg_sync_replication_slots() is executed, a slot
> named failover_slot exists on only one of the two servers. How can you
> justify emitting an error message complaining that "remote slot
> precedes local slot"? There's only one slot! I understand that, under
> the hood, we probably created an additional slot on the standby and
> then tried to fast-forward it, and this error occurred in the second
> step. But a user shouldn't have to understand those kinds of internal
> implementation details to make sense of the error message.

Fair point.

> If the problem is that we're not able to create a slot on the standby at an
> old enough LSN or XID position to permit its use with the
> corresponding slot on the master, it should be reported that way.

That is the case, and we should improve the LOG message. However, let
me first explain to you what is going on here. This happens because
the DDL is replicated before the pg_sync_replication_slots() call, due
to which the locally created slot on the standby will acquire an xmin
later (765) than the slot on the master (764). So, we can't sync in
that particular sync cycle because otherwise, we can't guarantee the
required rows will be present on the standby later when one tries to
use the slot.

IIUC, the users will use this feature where master (publisher) and
subscriber nodes are doing logical replication, and we want to keep
the corresponding logical slot's copy on the physical standby. So that
if the master goes down, then the subscriber can continue logical
replication from the physical standby. In such a setup, users won't
need to bother with such LOGs because even if we are not able to sync
the logical slot in a particular sync cycle and the LOG appears, we
should be able to sync in the next cycle.

In the case presented here, the logical slot is expected to keep
forwarding, and in the consecutive sync cycle, the sync should be
successful. Users using logical decoding APIs should also be aware
that if, for some reason, the logical slot is not moving forward,
the master/publisher node will start accumulating dead rows and WAL,
which can create bigger problems.

--
With Regards,
Amit Kapila.

#5 shveta malik <shveta.malik@gmail.com>
In reply to: Amit Kapila (#4)
1 attachment(s)
Re: Replication slot is not able to sync up

On Sat, May 24, 2025 at 10:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

>> If the problem is that we're not able to create a slot on the standby at an
>> old enough LSN or XID position to permit its use with the
>> corresponding slot on the master, it should be reported that way.
>
> That is the case, and we should improve the LOG message.

Agree that log messages need improvement. Please find the patch
attached for the same. I also intend to update the docs in this area
for users to understand this feature better, and will work on that
soon.

thanks
Shveta

Attachments:

v1-0001-Improve-log-messages-in-slotsync.patch (application/octet-stream)
#6 shveta malik <shveta.malik@gmail.com>
In reply to: shveta malik (#5)
2 attachment(s)
Re: Replication slot is not able to sync up

On Mon, May 26, 2025 at 12:02 PM shveta malik <shveta.malik@gmail.com> wrote:

> Agree that log messages need improvement. Please find the patch
> attached for the same. I also intend to update the docs in this area
> for users to understand this feature better, and will work on that
> soon.

PFA the patch with doc changes as well. The doc explains the need for
pg_logical_slot_get_changes() in a particular scenario.

Also attached is the script to show how this setup works. When the
replication slot is being actively consumed on the primary, we do not
observe that particular LOG ("could not synchronize replication slot") on the
standby, and synchronization proceeds without any manual intervention.

Thanks Nisha for the script.

thanks
Shveta

Attachments:

v2-0001-Improve-log-messages-and-docs-for-slotsync.patch (application/octet-stream)
test_slotsync.sh (text/x-sh)
#7 Masahiko Sawada <sawada.mshk@gmail.com>
In reply to: Amit Kapila (#4)
Re: Replication slot is not able to sync up

On Fri, May 23, 2025 at 10:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

> In the case presented here, the logical slot is expected to keep
> forwarding, and in the consecutive sync cycle, the sync should be
> successful. Users using logical decoding APIs should also be aware
> that if, for some reason, the logical slot is not moving forward,
> the master/publisher node will start accumulating dead rows and WAL,
> which can create bigger problems.

I've tried this case and am concerned that the slot synchronization
using pg_sync_replication_slots() would never succeed while the
primary keeps getting write transactions. Even if the user manually
consumes changes on the primary, the primary server keeps advancing
its XID in the meanwhile. On the standby, we ensure that
TransamVariables->nextXid is beyond the XID of the WAL record that it's
going to apply, so the xmin horizon calculated by
GetOldestSafeDecodingTransactionId() ends up always being higher than
the slot's catalog_xmin on the primary. We get the log message "could
not synchronize replication slot "s" because remote slot precedes
local slot" and clean up the slot on the standby at the end of
pg_sync_replication_slots().
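The failure mode described here can be illustrated with a toy model (plain
Python, not PostgreSQL code; the XID values and the `try_sync` helper are
purely illustrative):

```python
# Toy model of the initial-sync race: while the primary keeps assigning
# XIDs, the standby's "oldest safe decoding XID" (derived from the nextXid
# it has replayed) stays ahead of the primary slot's catalog_xmin, so each
# sync attempt fails and the temporary slot is dropped again.

def try_sync(remote_catalog_xmin: int, standby_safe_xmin: int) -> bool:
    # Mirrors the check behind "remote slot precedes local slot".
    return remote_catalog_xmin >= standby_safe_xmin

primary_next_xid = 800     # illustrative XIDs
slot_catalog_xmin = 764

attempts = []
for _ in range(3):
    # The standby replays everything the primary wrote, so its safe xmin
    # tracks the primary's nextXid.
    standby_safe_xmin = primary_next_xid
    attempts.append(try_sync(slot_catalog_xmin, standby_safe_xmin))
    # Manually consuming changes advances the slot, but concurrent write
    # transactions have already advanced nextXid further in the meantime.
    slot_catalog_xmin = primary_next_xid
    primary_next_xid += 10

print(attempts)  # every attempt fails in this model
```

In this model nothing ever holds back the standby's horizon between attempts,
which is the crux of the problem.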

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#8 Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
In reply to: Masahiko Sawada (#7)
RE: Replication slot is not able to sync up

On Wed, May 28, 2025 at 2:09 AM Masahiko Sawada wrote:


> I've tried this case and am concerned that the slot synchronization using
> pg_sync_replication_slots() would never succeed while the primary keeps
> getting write transactions. Even if the user manually consumes changes on the
> primary, the primary server keeps advancing its XID in the meanwhile. On the
> standby, we ensure that the
> TransamVariables->nextXid is beyond the XID of WAL record that it's
> going to apply so the xmin horizon calculated by
> GetOldestSafeDecodingTransactionId() ends up always being higher than the
> slot's catalog_xmin on the primary. We get the log message "could not
> synchronize replication slot "s" because remote slot precedes local slot" and
> cleanup the slot on the standby at the end of pg_sync_replication_slots().

I think the issue occurs because unlike the slotsync worker, the SQL API
removes temporary slots when the function ends, so it cannot hold back the
standby's catalog_xmin. If transactions on the primary keep advancing xids, the
source slot's catalog_xmin on the primary fails to catch up with the standby's
nextXid, causing sync failure.

We chose this behavior because we could not predict when (or if) the SQL
function might be executed again, and the creating session might persist after
promotion. Without automatic cleanup, this could lead to temporary slots being
retained for a longer time.

This only affects the initial sync when creating a new slot on the standby.
Once the slot exists, the standby's catalog_xmin stabilizes, preventing the
issue in subsequent syncs.

I think the SQL API was mainly intended for testing and debugging purposes
where controlled sync operations are useful. For production use, the slotsync
worker (with sync_replication_slots=on) is recommended because it automatically
handles this problem and requires minimal manual intervention. But to avoid
confusion, I think we should clearly document this distinction.

Best Regards,
Hou zj

#9 Masahiko Sawada <sawada.mshk@gmail.com>
In reply to: Zhijie Hou (Fujitsu) (#8)
Re: Replication slot is not able to sync up

On Tue, May 27, 2025 at 9:15 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:


> I think the issue occurs because unlike the slotsync worker, the SQL API
> removes temporary slots when the function ends, so it cannot hold back the
> standby's catalog_xmin. If transactions on the primary keep advancing xids, the
> source slot's catalog_xmin on the primary fails to catch up with the standby's
> nextXid, causing sync failure.

Agreed with this analysis.

> This only affects the initial sync when creating a new slot on the standby.
> Once the slot exists, the standby's catalog_xmin stabilizes, preventing the
> issue in subsequent syncs.

Right. I think this is an area where we can improve, if there is a
real use case.

> I think the SQL API was mainly intended for testing and debugging purposes
> where controlled sync operations are useful. For production use, the slotsync
> worker (with sync_replication_slots=on) is recommended because it automatically
> handles this problem and requires minimal manual intervention. But to avoid
> confusion, I think we should clearly document this distinction.

I didn't know it was intended for testing and debugging purposes, so
clarifying that in the documentation would be a good idea. Also, I
agree that using the slotsync worker is the primary usage of this
feature. I'm interested in whether there is a use case where the SQL
API is preferable. If there is, we can improve the SQL API part,
especially the first synchronization part, for v19 or later.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#10 shveta malik <shveta.malik@gmail.com>
In reply to: Masahiko Sawada (#9)
1 attachment(s)
Re: Replication slot is not able to sync up

On Wed, May 28, 2025 at 11:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> I didn't know it was intended for testing and debugging purposes, so
> clarifying that in the documentation would be a good idea.

I have added the suggested docs in v3.

thanks
Shveta

Attachments:

v3-0001-Improve-log-messages-and-docs-for-slotsync.patch (application/octet-stream)
#11 Robert Haas <robertmhaas@gmail.com>
In reply to: Zhijie Hou (Fujitsu) (#8)
Re: Replication slot is not able to sync up

On Wed, May 28, 2025 at 12:15 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

> I think the SQL API was mainly intended for testing and debugging purposes
> where controlled sync operations are useful. For production use, the slotsync
> worker (with sync_replication_slots=on) is recommended because it automatically
> handles this problem and requires minimal manual intervention. But to avoid
> confusion, I think we should clearly document this distinction.

If this analysis is correct, this should never have been committed, at
least not in this form. When we ship something, it needs to work.
Testing and debugging facilities are best placed in src/test/modules
or in contrib; if for some reason they really need to be in
src/backend, then they had better be clearly documented as such.

What really annoys me about this is that the function gives every
superficial impression of being something you could actually use. Why
wouldn't a user believe that if they periodically connect and run
pg_sync_replication_slots(), things will be OK? I can certainly
imagine a user *wanting* that to work. I'd like that to work. But it
seems like either it's impossible for some reason that isn't clear to
me, and we just went ahead and shipped it in a non-working state
anyway, or it is possible to make it work and we didn't do the
necessary engineering before something got committed. Either way,
that's really disappointing.

> I think the issue occurs because unlike the slotsync worker, the SQL API
> removes temporary slots when the function ends, so it cannot hold back the
> standby's catalog_xmin. If transactions on the primary keep advancing xids, the
> source slot's catalog_xmin on the primary fails to catch up with the standby's
> nextXid, causing sync failure.

I still don't understand how this problem arises in the first place.
It seems like you're describing a situation where we need to prevent
the standby from getting ahead of the primary, but that should be
impossible by definition.

--
Robert Haas
EDB: http://www.enterprisedb.com

#12 Amit Kapila <amit.kapila16@gmail.com>
In reply to: Robert Haas (#11)
Re: Replication slot is not able to sync up

On Thu, May 29, 2025 at 6:01 PM Robert Haas <robertmhaas@gmail.com> wrote:


> I still don't understand how this problem arises in the first place.
> It seems like you're describing a situation where we need to prevent
> the standby from getting ahead of the primary, but that should be
> impossible by definition.

The reason is that we do not allow creating a synced slot if the
required WAL or catalog rows for this slot have been removed or are at
risk of removal. The way we achieve it is that during the first
sync_slot call, either via slotsync worker or API, we create a
temporary slot on the standby with xmin pointed to the safest possible
xmin (catalog_xmin) on standby computed by
GetOldestSafeDecodingTransactionId() and WAL (restart_lsn) pointed to
by the oldest WAL present on standby. Now, if the source slot's (slot
on primary) corresponding location/xmin are prior to the location/xmin
on the standby then we can't sync the slot immediately because there
is no guarantee that required resources (WAL/catalog_rows) will be
available when we try to use the synced slot after promotion. The
slotsync worker will keep retrying to sync the slot and will
eventually succeed once the source slot's values are safe to be synced
to the standby. With the API, we didn't implement this retry logic,
which is why we see the behaviour reported here. Note that once the
first sync is successful, on subsequent calls even the API should work
similarly to the worker.
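The persist condition described above can be sketched as a standalone check
(a simplified illustration; the names and integer-valued LSNs are ours, not
the server's actual code):

```python
from dataclasses import dataclass

@dataclass
class SlotState:
    restart_lsn: int      # LSN represented as a plain integer in this sketch
    catalog_xmin: int

def can_persist_synced_slot(remote: SlotState, local: SlotState) -> bool:
    # A temporary synced slot may be persisted only if the remote (primary)
    # slot does not precede the local one, i.e. the WAL and catalog rows the
    # remote slot still needs are guaranteed to be retained on the standby.
    return (remote.restart_lsn >= local.restart_lsn
            and remote.catalog_xmin >= local.catalog_xmin)

# The reported case: same LSN, but the remote catalog_xmin (764) is older
# than the locally computed safe value (765), so the sync cycle is skipped.
remote = SlotState(restart_lsn=0x0B000060, catalog_xmin=764)
local = SlotState(restart_lsn=0x0B000060, catalog_xmin=765)
print(can_persist_synced_slot(remote, local))  # False
```

A retrying caller, like the slotsync worker, simply re-evaluates this check
each cycle until the remote slot has advanced far enough.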

I agree that the current use of API is limited, such that one can use
it in a controlled environment (e.g., the first time sync happens
before other operations on primary), or to debug this functionality,
or to write tests. It is not clear to me why someone would not use the
built-in functionality to sync slots and prefer this API. But going
forward (as we see people would like to use this API to sync slots),
it is not that difficult to improve this API to match its behaviour
with the built-in worker for initial/first sync.

I see that we separately document functions used for development/debug [1],
and this API could be documented in that way.

[1]: https://www.postgresql.org/docs/current/functions-textsearch.html#TEXTSEARCH-FUNCTIONS-DEBUG-TABLE

--
With Regards,
Amit Kapila.

#13 Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
In reply to: Masahiko Sawada (#7)
1 attachment(s)
RE: Replication slot is not able to sync up

On Wed, May 28, 2025 at 2:09 AM Masahiko Sawada wrote:


> I've tried this case and am concerned that the slot synchronization using
> pg_sync_replication_slots() would never succeed while the primary keeps
> getting write transactions. Even if the user manually consumes changes on the
> primary, the primary server keeps advancing its XID in the meanwhile. On the
> standby, we ensure that the
> TransamVariables->nextXid is beyond the XID of WAL record that it's
> going to apply so the xmin horizon calculated by
> GetOldestSafeDecodingTransactionId() ends up always being higher than the
> slot's catalog_xmin on the primary. We get the log message "could not
> synchronize replication slot "s" because remote slot precedes local slot" and
> cleanup the slot on the standby at the end of pg_sync_replication_slots().

To improve this workload scenario, we can modify pg_sync_replication_slots() to
wait for the primary slot to advance to a suitable position before completing
synchronization and removing the temporary slot. This would allow the sync to
complete as soon as the primary slot advances, whether through
pg_logical_xx_get_changes() or other ways.

I've created a POC (attached) that currently waits indefinitely for the remote
slot to catch up. We could later add a timeout parameter to control maximum
wait time if this approach seems acceptable.

I tested that, when pgbench TPC-B is running on the primary, calling
pg_sync_replication_slots() on the standby correctly blocks until I advance the
primary slot position by calling pg_logical_xx_get_changes().
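
For example, with the built-in SQL interface the primary slot can be
advanced like this (a sketch using the slot name from the original report;
an external consumer would normally do the equivalent over the streaming
protocol):

```sql
-- On the primary: consume pending changes, which advances the slot's
-- restart_lsn and catalog_xmin; a waiting pg_sync_replication_slots()
-- on the standby can then persist the synced slot.
SELECT * FROM pg_logical_slot_get_changes('failover_slot', NULL, NULL);
```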

If the basic idea sounds reasonable, then I can start a separate
thread to extend this API. Thoughts?

Best Regards,
Hou zj

Attachments:

0001-POC-Improve-initial-slot-synchronization-in-pg_sync_repl.patchapplication/octet-stream; name=0001-POC-Improve-initial-slot-synchronization-in-pg_sync_repl.patch
#14Amul Sul
Amul Sul
sulamul@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#13)
Re: Replication slot is not able to sync up

On Fri, May 30, 2025 at 3:38 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

On Wed, May 28, 2025 at 2:09 AM Masahiko Sawada wrote:

On Fri, May 23, 2025 at 10:07 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

In the case presented here, the logical slot is expected to keep
forwarding, and in the consecutive sync cycle, the sync should be
successful. Users using logical decoding APIs should also be aware
that if, for some reason, the logical slot is not moving forward,
the master/publisher node will start accumulating dead rows and WAL,
which can create bigger problems.

I've tried this case and am concerned that the slot synchronization using
pg_sync_replication_slots() would never succeed while the primary keeps
getting write transactions. Even if the user manually consumes changes on the
primary, the primary server keeps advancing its XID in the meanwhile. On the
standby, we ensure that the
TransamVariables->nextXid is beyond the XID of the WAL record that it's
going to apply, so the xmin horizon calculated by
GetOldestSafeDecodingTransactionId() ends up always being higher than the
slot's catalog_xmin on the primary. We get the log message "could not
synchronize replication slot "s" because remote slot precedes local slot" and
clean up the slot on the standby at the end of pg_sync_replication_slots().

To improve this workload scenario, we can modify pg_sync_replication_slots() to
wait for the primary slot to advance to a suitable position before completing
synchronization and removing the temporary slot. This would allow the sync to
complete as soon as the primary slot advances, whether through
pg_logical_xx_get_changes() or other ways.

I've created a POC (attached) that currently waits indefinitely for the remote
slot to catch up. We could later add a timeout parameter to control maximum
wait time if this approach seems acceptable.

Quick question -- due to my limited understanding of this area: why
can't we perform an action similar to pg_logical_slot_get_changes()
implicitly from pg_sync_replication_slots()? Would there be any
implications of doing so?

Regards,
Amul

#15Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Amul Sul (#14)
Re: Replication slot is not able to sync up

On Fri, May 30, 2025 at 4:05 PM Amul Sul <sulamul@gmail.com> wrote:

Quick question -- due to my limited understanding of this area: why
can't we perform an action similar to pg_logical_slot_get_changes()
implicitly from pg_sync_replication_slots()? Would there be any
implications of doing so?

Yes, there would be implications if we did it that way. It would mean
that the consumer of the slot may not process those changes (for which
sync_slot API has done the get_changes) and send it to the client.
Consider a publisher-subscriber and physical standby setup. In this
setup, the subscriber creates a logical slot corresponding to the
subscription on the publisher. Now, the publisher processes changes and
sends them to the subscriber; the slot is then advanced (both its xmin
and WAL locations) once the corresponding changes have been sent to the
client.

If we allow pg_sync_replication_slots() to do
pg_logical_slot_get_changes or equivalent in some way, then we may end
up advancing the slot without sending the changes to the subscriber,
which would be considered a data loss for the subscriber.

I have explained in terms of built-in logical replication, but the
external plugins using these APIs (pg_logical_*) should be doing
something similar to process the changes and advance the slot.
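
To illustrate the hazard in SQL terms (a sketch; the slot name is
hypothetical, and the same distinction applies to the equivalent
replication-protocol commands):

```sql
-- pg_logical_slot_get_changes() both returns the changes and advances
-- the slot, so a second call will not see them again; if slot sync did
-- this implicitly, the real consumer would silently lose those changes.
SELECT data FROM pg_logical_slot_get_changes('some_slot', NULL, NULL);

-- pg_logical_slot_peek_changes() is the non-consuming variant: it
-- returns the same changes without advancing the slot.
SELECT data FROM pg_logical_slot_peek_changes('some_slot', NULL, NULL);
```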

Does this answer your question and make sense to you?

--
With Regards,
Amit Kapila.

#16Amul Sul
Amul Sul
sulamul@gmail.com
In reply to: Amit Kapila (#15)
Re: Replication slot is not able to sync up

On Fri, May 30, 2025 at 4:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, May 30, 2025 at 4:05 PM Amul Sul <sulamul@gmail.com> wrote:

Quick question -- due to my limited understanding of this area: why
can't we perform an action similar to pg_logical_slot_get_changes()
implicitly from pg_sync_replication_slots()? Would there be any
implications of doing so?

Yes, there would be implications if we did it that way. It would mean
that the consumer of the slot may not process those changes (for which
sync_slot API has done the get_changes) and send it to the client.
Consider a publisher-subscriber and physical standby setup. In this
setup, the subscriber creates a logical slot corresponding to the
subscription on the publisher. Now, the publisher processes changes and
sends them to the subscriber; the slot is then advanced (both its xmin
and WAL locations) once the corresponding changes have been sent to the
client.

If we allow pg_sync_replication_slots() to do
pg_logical_slot_get_changes or equivalent in some way, then we may end
up advancing the slot without sending the changes to the subscriber,
which would be considered a data loss for the subscriber.

I have explained in terms of built-in logical replication, but the
external plugins using these APIs (pg_logical_*) should be doing
something similar to process the changes and advance the slot.

Does this answer your question and make sense to you?

Yes, understood. Thank you!

Regards,
Amul

#17Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#13)
Re: Replication slot is not able to sync up

On Fri, May 30, 2025 at 6:08 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

To improve this workload scenario, we can modify pg_sync_replication_slots() to
wait for the primary slot to advance to a suitable position before completing
synchronization and removing the temporary slot. This would allow the sync to
complete as soon as the primary slot advances, whether through
pg_logical_xx_get_changes() or other ways.

My understanding of this area is limited, but this sounds potentially
promising to me. The current approach seems very timing-dependent.
Depending on the state of the primary vs. the state of the standby, a
call to pg_sync_replication_slots() may either create a slot or fail
to do so. A call at a slightly earlier or later time might have had a
different result. IIUC, this proposal would make different results due
to minor timing variations less probable.

--
Robert Haas
EDB: http://www.enterprisedb.com

#18Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#10)
Re: Replication slot is not able to sync up

On Thu, May 29, 2025 at 8:39 AM shveta malik <shveta.malik@gmail.com> wrote:

On Wed, May 28, 2025 at 11:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I didn't know it was intended for testing and debugging purposes, so
clarifying it in the documentation would be a good idea.

I have added the suggested docs in v3.

- errmsg("could not synchronize replication slot \"%s\"", remote_slot->name),
- errdetail("Logical decoding could not find consistent point from
local slot's LSN %X/%X.",
+ errmsg("could not synchronize replication slot \"%s\" to prevent
data loss", remote_slot->name),
+ errdetail("Standby does not have enough data to decode WALs at LSN %X/%X.",
    LSN_FORMAT_ARGS(slot->data.restart_lsn)));

I find the errdetail is not clear about the current state, which is
that we can't yet build a consistent snapshot on the standby to allow
decoding. Would it be better to have an errdetail like: "Standby could
not build a consistent snapshot to decode WALs at LSN %X/%X."?

--
With Regards,
Amit Kapila.

#19Zhijie Hou (Fujitsu)
Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: shveta malik (#10)
RE: Replication slot is not able to sync up

On Thu, May 29, 2025 at 11:09 AM shveta malik wrote:

On Wed, May 28, 2025 at 11:56 AM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:

I didn't know it was intended for testing and debugging purposes, so
clarifying it in the documentation would be a good idea.

I have added the suggested docs in v3.

Thanks for updating the patch.

I have a few suggestions for the document from a user's perspective.

1.

... , one
condition must be met. The logical replication slot on primary must be advanced
to such a catalog change position (catalog_xmin) and WAL's LSN (restart_lsn) for
which sufficient data is retained on the corresponding standby server.

The term "catalog change position" might not be easy for some readers
to grasp. Would it be clearer to phrase it as follows?

"The logical replication slot on the primary must reach a state where the WALs
and system catalog rows retained by the slot are also present on the
corresponding standby server. "

2.

If the primary slot is still lagging behind and synchronization is attempted
for the first time, then to prevent the data loss as explained, persistence
and synchronization of newly created slot will be skipped, and the following
log message may appear on standby.

The phrase "lagging behind" typically refers to the standby, which can be a bit
confusing. I understand that the user can work it out from the context, but
would it be easier to understand with a more detailed description like the
one below?

"If the WALs and system catalog rows retained by the slot on the primary have
already been purged from the standby server, ..."

3.
<programlisting>
LOG: could not synchronize replication slot "failover_slot" to prevent data loss
DETAIL: The remote slot needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has LSN 0/3003F28 and catalog xmin 766.
</programlisting>

It seems that one space is missing between "LOG:" and the message.

Best Regards,
Hou zj

#20shveta malik
shveta malik
shveta.malik@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#19)
1 attachment(s)
Re: Replication slot is not able to sync up

On Tue, Jun 10, 2025 at 3:20 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

Thanks for updating the patch.

I have few suggestions for the document from a user's perspective.

Thanks Hou-San, I agree with your suggestions. Addressed in v4.

Also addressed Amit's suggestion at [1] to improve errdetail.

[1]: /messages/by-id/CAA4eK1JKXCMDqfFgNtemVZ9ge4KrQtwSQG1OwMLNHRBDfnH9rA@mail.gmail.com

thanks
Shveta

Attachments:

v4-0001-Improve-log-messages-and-docs-for-slotsync.patchapplication/octet-stream; name=v4-0001-Improve-log-messages-and-docs-for-slotsync.patch
#21Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#20)
Re: Replication slot is not able to sync up

On Wed, Jun 11, 2025 at 7:19 AM shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Jun 10, 2025 at 3:20 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

Thanks for updating the patch.

I have few suggestions for the document from a user's perspective.

Thanks Hou-San, I agree with your suggestions. Addressed in v4.

Also addressed Amit's suggestion at [1] to improve errdetail.

So, the overall direction we are taking here is that we want to
improve the existing LOG/DEBUG messages and docs for HEAD and back
branches. Then we will improve the API behavior based on Hou-San's
patch for PG19. Let me know if you or others think otherwise.

+    <para>
+     Apart from enabling <link linkend="guc-sync-replication-slots">
+     <varname>sync_replication_slots</varname></link> to synchronize slots
+     periodically, failover slots can be manually synchronized by invoking
+     <link linkend="pg-sync-replication-slots">
+     <function>pg_sync_replication_slots</function></link> on the standby.
+     However, this function is primarily intended for testing and debugging
+     purposes and should be used with caution. The recommended approach to
+     synchronize slots is by enabling <link
linkend="guc-sync-replication-slots">
+     <varname>sync_replication_slots</varname></link> on the standby, as it
+     ensures continuous and automatic synchronization of replication slots,
+     facilitating seamless failover and high availability.
+    </para>
+
+    <para>
+     When slot-synchronization setup is done as recommended, and
+     slot-synchronization is performed the very first time either automatically
+     or by <link linkend="pg-sync-replication-slots">
+     <function>pg_sync_replication_slots</function></link>,
+     then for the synchronized slot to be created and persisted on the standby,
+     one condition must be met. The logical replication slot on the primary
+     must reach a state where the WALs and system catalog rows retained by
+     the slot are also present on the corresponding standby server. This is
+     needed to prevent any data loss and to allow logical replication
to continue
+     seamlessly through the synchronized slot if needed after promotion.
+     If the WALs and system catalog rows retained by the slot on the
primary have
+     already been purged from the standby server, and synchronization
is attempted
+     for the first time, then to prevent the data loss as explained,
persistence
+     and synchronization of newly created slot will be skipped, and
the following
+     log message may appear on standby.
+<programlisting>
+     LOG:  could not synchronize replication slot "failover_slot"
+     DETAIL:  Synchronization could lead to data loss as the remote
slot needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby
has LSN 0/3003F28 and catalog xmin 756
+</programlisting>
+     If the logical replication slot is actively consumed by a
consumer, no further
+     manual action is needed by the user, as the slot on primary will
be advanced
+     automatically, and synchronization will proceed in the next
cycle. However,
+     if no logical replication consumer is set up yet, to advance the slot, it
+     is recommended to manually run the <link
linkend="pg-logical-slot-get-changes">
+     <function>pg_logical_slot_get_changes</function></link> or
+     <link linkend="pg-logical-slot-get-binary-changes">
+     <function>pg_logical_slot_get_binary_changes</function></link>
on the primary
+     slot and allow synchronization to proceed.
+    </para>
+

I have reworded the above as follows:
To enable periodic synchronization of replication slots, it is
recommended to activate sync_replication_slots on the standby server.
While manual synchronization is possible using
pg_sync_replication_slots, this function is primarily intended for
testing and debugging and should be used with caution. Automatic
synchronization via sync_replication_slots ensures continuous slot
updates, supporting seamless failover and maintaining high
availability. When slot synchronization is configured as recommended,
and the initial synchronization is performed either automatically or
manually via pg_sync_replication_slots, the standby can persist the
synchronized slot only if the following condition is met: The logical
replication slot on the primary must retain WALs and system catalog
rows that are still available on the standby. This ensures data
integrity and allows logical replication to continue smoothly after
promotion.
If the required WALs or catalog rows have already been purged from the
standby, the slot will not be persisted to avoid data loss. In such
cases, the following log message may appear:

LOG: could not synchronize replication slot "failover_slot"
DETAIL: Synchronization could lead to data loss as the remote slot
needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has
LSN 0/3003F28 and catalog xmin 756

If the logical replication slot is actively used by a consumer, no
manual intervention is needed; the slot will advance automatically,
and synchronization will resume in the next cycle. However, if no
consumer is configured, it is advisable to manually advance the slot
on the primary using pg_logical_slot_get_changes or
pg_logical_slot_get_binary_changes, allowing synchronization to
proceed.

Let me know what you think of above?

--
With Regards,
Amit Kapila.

#22Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: Amit Kapila (#21)
Re: Replication slot is not able to sync up

On Wed, Jun 11, 2025 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jun 11, 2025 at 7:19 AM shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Jun 10, 2025 at 3:20 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

Thanks for updating the patch.

I have few suggestions for the document from a user's perspective.

Thanks Hou-San, I agree with your suggestions. Addressed in v4.

Also addressed Amit's suggestion at [1] to improve errdetail.

So, the overall direction we are taking here is that we want to
improve the existing LOG/DEBUG messages and docs for HEAD and back
branches. Then we will improve the API behavior based on Hou-San's
patch for PG19. Let me know if you or others think otherwise.

+    <para>
+     Apart from enabling <link linkend="guc-sync-replication-slots">
+     <varname>sync_replication_slots</varname></link> to synchronize slots
+     periodically, failover slots can be manually synchronized by invoking
+     <link linkend="pg-sync-replication-slots">
+     <function>pg_sync_replication_slots</function></link> on the standby.
+     However, this function is primarily intended for testing and debugging
+     purposes and should be used with caution. The recommended approach to
+     synchronize slots is by enabling <link
linkend="guc-sync-replication-slots">
+     <varname>sync_replication_slots</varname></link> on the standby, as it
+     ensures continuous and automatic synchronization of replication slots,
+     facilitating seamless failover and high availability.
+    </para>
+
+    <para>
+     When slot-synchronization setup is done as recommended, and
+     slot-synchronization is performed the very first time either automatically
+     or by <link linkend="pg-sync-replication-slots">
+     <function>pg_sync_replication_slots</function></link>,
+     then for the synchronized slot to be created and persisted on the standby,
+     one condition must be met. The logical replication slot on the primary
+     must reach a state where the WALs and system catalog rows retained by
+     the slot are also present on the corresponding standby server. This is
+     needed to prevent any data loss and to allow logical replication
to continue
+     seamlessly through the synchronized slot if needed after promotion.
+     If the WALs and system catalog rows retained by the slot on the
primary have
+     already been purged from the standby server, and synchronization
is attempted
+     for the first time, then to prevent the data loss as explained,
persistence
+     and synchronization of newly created slot will be skipped, and
the following
+     log message may appear on standby.
+<programlisting>
+     LOG:  could not synchronize replication slot "failover_slot"
+     DETAIL:  Synchronization could lead to data loss as the remote
slot needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby
has LSN 0/3003F28 and catalog xmin 756
+</programlisting>
+     If the logical replication slot is actively consumed by a
consumer, no further
+     manual action is needed by the user, as the slot on primary will
be advanced
+     automatically, and synchronization will proceed in the next
cycle. However,
+     if no logical replication consumer is set up yet, to advance the slot, it
+     is recommended to manually run the <link
linkend="pg-logical-slot-get-changes">
+     <function>pg_logical_slot_get_changes</function></link> or
+     <link linkend="pg-logical-slot-get-binary-changes">
+     <function>pg_logical_slot_get_binary_changes</function></link>
on the primary
+     slot and allow synchronization to proceed.
+    </para>
+

I have reworded the above as follows:
To enable periodic synchronization of replication slots, it is
recommended to activate sync_replication_slots on the standby server.
While manual synchronization is possible using
pg_sync_replication_slots, this function is primarily intended for
testing and debugging and should be used with caution. Automatic
synchronization via sync_replication_slots ensures continuous slot
updates, supporting seamless failover and maintaining high
availability. When slot synchronization is configured as recommended,
and the initial synchronization is performed either automatically or
manually via pg_sync_replication_slots, the standby can persist the
synchronized slot only if the following condition is met: The logical
replication slot on the primary must retain WALs and system catalog
rows that are still available on the standby. This ensures data
integrity and allows logical replication to continue smoothly after
promotion.
If the required WALs or catalog rows have already been purged from the
standby, the slot will not be persisted to avoid data loss. In such
cases, the following log message may appear:

LOG: could not synchronize replication slot "failover_slot"
DETAIL: Synchronization could lead to data loss as the remote slot
needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has
LSN 0/3003F28 and catalog xmin 756

If the logical replication slot is actively used by a consumer, no
manual intervention is needed; the slot will advance automatically,
and synchronization will resume in the next cycle. However, if no
consumer is configured, it is advisable to manually advance the slot
on the primary using pg_logical_slot_get_changes or
pg_logical_slot_get_binary_changes, allowing synchronization to
proceed.

Let me know what you think of above?

Phrases like "... it is recommended..." and "... intended for testing
and debugging .. " and "... should be used with caution." and "... it
is advisable to..." seem like indicators that parts of the above
description should be using SGML markup such as <caution> or <warning>
or <note> instead of just plain text.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#23shveta malik
shveta malik
shveta.malik@gmail.com
In reply to: Peter Smith (#22)
Re: Replication slot is not able to sync up

On Thu, Jun 12, 2025 at 4:13 AM Peter Smith <smithpb2250@gmail.com> wrote:

Phrases like "... it is recommended..." and "... intended for testing
and debugging .. " and "... should be used with caution." and "... it
is advisable to..." seem like indicators that parts of the above
description should be using SGML markup such as <caution> or <warning>
or <note> instead of just plain text.

I feel WARNING and CAUTION markups could be a little strong for the
concerned case. Such markups are generally used when there is a
side-effect involved with the usage. But in our case, there is no such
side-effect with the API. At max it may fail without harming the
system and will succeed in the next invocation. But I also feel that
such sections catch user attention. Thus if needed, we can have a NOTE
section to convey the recommended way of slot synchronization.
Thoughts?

Similar to our case, I see some other docs using caution words without
a CAUTION markup. Please search for 'caution' in [1], [2], [3].

[1]: https://www.postgresql.org/docs/current/continuous-archiving.html
[2]: https://www.postgresql.org/docs/current/sql-altertable.html
[3]: https://www.postgresql.org/docs/18/oauth-validator-design.html#OAUTH-VALIDATOR-DESIGN-USERMAP-DELEGATION

thanks
Shveta

#24Perumal Raj
Perumal Raj
perucinci@gmail.com
In reply to: Suraj Kharage (#1)
Logical Replication slot disappeared after promote Standby

Hi Community,

I have installed postgres version 17.5 with the following setup:

Primary
-- Secondary A
-- Secondary B
-- Secondary C

Config:
wal_level = 'logical'
max_wal_senders = '10'
max_replication_slots = '10'
wal_keep_size = '512MB'
hot_standby = 'on'
sync_replication_slots = 'on'
hot_standby_feedback = 'on'
synchronized_standby_slots = 'Kafka_logical_slot'

1. The slotsync worker is running all the time (automatic sync).
2. When I create the logical replication slot (Kafka_logical_slot) on the
Primary, it gets synced on both Secondary A and Secondary B.
3. It doesn't appear on Secondary C, since it's not a direct replica.

Issue:
When I stop the Primary node and promote one of the direct secondaries
(A, B), the logical replication slot vanishes.

Am I missing any configuration?

Please share your experience.

Thanks,

#25Zhijie Hou (Fujitsu)
Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Perumal Raj (#24)
RE: Logical Replication slot disappeared after promote Standby

On Thu, Jun 12, 2025 at 4:08 PM Perumal Raj wrote:

Hi Community,

I have installed postgres version 17.5 with following setup,

Primary
-- Secondary A
-- Secondary B
-- Secondary C

Config:
wal_level = 'logical'
max_wal_senders = '10'
max_replication_slots = '10'
wal_keep_size = '512MB'
hot_standby = 'on'
sync_replication_slots = 'on'
hot_standby_feedback = 'on'
synchronized_standby_slots = 'Kafka_logical_slot'

1. slotsync worker is running all the time ( Automatic sync)
2. When I create logical replication slot(Kafka_logical_slot) in Primary, it
got synced on both Secondary A and > Secondary B
3. It didn't appear in Secondary C , Since its not direct replica.

Issue : When I stop Primary node and promote one of the Direct secondary
(A,B) node. logical replication slot is vanished.

Am I missing any configuration ?

Please share your experience.

Thanks for reporting.

To narrow down potential causes, please confirm the following:

1) One possibility is that the slot has not been successfully synchronized to
the standby. To verify, check for the presence of the following log message:

LOG: newly created replication slot "your_slot" is sync-ready now

If this message is absent, it indicates that the slot has not been successfully
synced. Additionally, you can confirm the sync status by inspecting the
pg_replication_slots.temporary field on the standby; a value of true suggests
that the slot sync has not completed.
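
For instance, the sync status can be checked on the standby with a query
like this (a sketch):

```sql
-- On the standby: temporary = true (with synced = true) means the slot
-- has been created by the slotsync worker but is not yet persisted,
-- i.e. not sync-ready.
SELECT slot_name, temporary, synced, failover, restart_lsn, catalog_xmin
FROM pg_replication_slots;
```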

2) We typically recommend specifying the primary_slot_name on the standby to
prevent slot invalidation due to catalog row removal on the primary. Please
check your logs for possible invalidation messages:

LOG: invalidating obsolete replication slot "your_slot"
or
LOG: terminating process 12344 to release replication slot "your_slot"

3) Is there a chance that the slot was dropped on the primary before stopping
it and promoting the standby? If so, the synced slot would also be dropped
in this scenario.

Best Regards,
Hou zj

#26Perumal Raj
Perumal Raj
perucinci@gmail.com
In reply to: Suraj Kharage (#1)
Re: Logical Replication slot disappeared after promote Standby

Hi Hou zj

I have found a strange issue, but I'm not sure if I am doing anything wrong.

I am able to see the logical slot on the STANDBY even after promote. 👏

Importantly, the logical replication slot is persistent on STANDBYs that had
already established a connection with the Primary before the logical
replication slot was created.

But if I create any new replica (direct to the Primary) after the logical
replication slot creation, then it's not persistent (temporary=true).

New Replica:
   node   |     slot_name      | slot_type | temporary | active |  plugin  |   database   | failover | synced | restart_lsn | confirmed_flush_lsn |        inactive_since
----------+--------------------+-----------+-----------+--------+----------+--------------+----------+--------+-------------+---------------------+------------------------------
 stand-by | kafka_logical_slot | logical   | t         | t      | pgoutput | replica_test | t        | t      | 0/6C000000  |                     | 2025-06-13 00:43:15.61492+00

Old Replica:
   node   |     slot_name      | slot_type | temporary | active |  plugin  |   database   | failover | synced | restart_lsn | confirmed_flush_lsn |        inactive_since
----------+--------------------+-----------+-----------+--------+----------+--------------+----------+--------+-------------+---------------------+-------------------------------
 stand-by | kafka_logical_slot | logical   | f         | f      | pgoutput | replica_test | t        | t      | 0/6D000060  | 0/6D000098          | 2025-06-13 00:45:11.547671+00

Not sure if any prerequisite is missing in my test environment, or if there
is any limitation.

I have tested this feature with Kafka/Debezium version 3.1; not sure if that
behaves differently.

Please shed some light here.

Thanks for the time.

On Thu, Jun 12, 2025 at 8:09 AM Perumal Raj <perucinci@gmail.com> wrote:

Thanks Hou zj

I will capture log message and share it with you,

The temporary column is marked 'false' everywhere, but the synced column is
marked 'false' on the Primary, whereas it was 'true' on both direct STANDBYs.

#3: No, I didn't drop the slot on the Primary. In fact, Secondary B (another
direct standby) is still showing the slot.

On Thu, Jun 12, 2025 at 1:44 AM Zhijie Hou (Fujitsu) <
houzj.fnst@fujitsu.com> wrote:

On Thu, Jun 12, 2025 at 4:08 PM Perumal Raj wrote:

Hi Community,

I have installed postgres version 17.5 with following setup,

Primary
-- Secondary A
-- Secondary B
-- Secondary C

Config:
wal_level = 'logical'
max_wal_senders = '10'
max_replication_slots = '10'
wal_keep_size = '512MB'
hot_standby = 'on'
sync_replication_slots = 'on'
hot_standby_feedback = 'on'
synchronized_standby_slots = 'Kafka_logical_slot'

1. slotsync worker is running all the time ( Automatic sync)
2. When I create logical replication slot(Kafka_logical_slot) in

Primary, it

got synced on both Secondary A and > Secondary B
3. It didn't appear in Secondary C , Since its not direct replica.

Issue : When I stop Primary node and promote one of the Direct secondary
(A,B) node. logical replication slot is vanished.

Am I missing any configuration ?

Please share your experience.

Thanks for reporting.

To narrow down potential causes, please confirm the following:

1) One possibility is that the slot has not been successfully
synchronized to
the standby. To verify, check for the presence of the following log
message:

LOG: newly created replication slot "your_slot" is sync-ready now

If this message is absent, it indicates that the slot has not been
successfully
synced. Additionally, you can confirm the sync status by inspecting the
pg_replication_slots.temporary field on the standby; a value of true
suggests
that the slot sync has not completed.

2) We typically recommend specifying the primary_slot_name on the standby
to
prevent slot invalidation due to catalog row removal on the primary.
Please
check your logs for possible invalidation messages:

LOG: invalidating obsolete replication slot "your_slot"
or
LOG: terminating process 12344 to release replication slot "your_slot"

3) Is there a chance that the slot was dropped on the primary before
stopping
it and promoting the standby? If so, the synced slot would also be
dropped
in this scenario.

Best Regards,
Hou zj

#27shveta malik
shveta malik
shveta.malik@gmail.com
In reply to: Perumal Raj (#26)
Re: Logical Replication slot disappeared after promote Standby

On Fri, Jun 13, 2025 at 6:23 AM Perumal Raj <perucinci@gmail.com> wrote:

Hi Hou zj

I have found some strange issue , but not sure if I am doing anything wrong.

I am able to see logical slot at STANDBY even after promote. 👏

Good to know.

Importantly Logical replication slot is persistance in STANDBYs which already established connection with Primary before logical replication slot creation.

But If I create any new replica(Direct to Primary) after logical replication slot creation, then its not persistance(temporary=true) .

New Replica :
node | slot_name | slot_type | temporary | active | plugin | database | failover | synced | restart_lsn | confirmed_flush_lsn | inactive_since
----------+--------------------+-----------+-----------+--------+----------+---------------+----------+--------+-------------+---------------------+------------------------------
stand-by | kafka_logical_slot | logical | t | t | pgoutput | replica_test | t | t | 0/6C000000 | | 2025-06-13 00:43:15.61492+00

Old Replica ,
node | slot_name | slot_type | temporary | active | plugin | database | failover | synced | restart_lsn | confirmed_flush_lsn | inactive_since
----------+--------------------+-----------+-----------+--------+----------+---------------+----------+--------+-------------+---------------------+-------------------------------
stand-by | kafka_logical_slot | logical | f | f | pgoutput | replica_test | t | t | 0/6D000060 | 0/6D000098 | 2025-06-13 00:45:11.547671+00

Not sure if any prerequisite is missing in my test environment, or if this is a limitation.

It may be that the slot is not sync-ready yet (and thus not persisted) on
the new replica because the primary has older values of xmin and lsn. We
do not allow persisting a synced slot if the required WAL or catalog rows
for this slot have been removed or are at risk of removal on the standby.
The slot will be persisted in the next few cycles of automatic slot
synchronization, once it is ensured that the source slot's values are safe
to sync to the standby. But to confirm my diagnosis, please provide this
information:

1)
Output of this query on both primary and new-replica (where slot is temporary)
select slot_name, failover, synced, temporary, catalog_xmin,
restart_lsn, confirmed_flush_lsn from pg_replication_slots;

2)
Please check the logs on the new replica for the presence of this message:
LOG: could not synchronize replication slot "kafka_logical_slot".

If found, please provide us with both the LOG and DETAIL messages
dumped in the log file.

thanks
Shveta

#28Perumal Raj
Perumal Raj
perucinci@gmail.com
In reply to: shveta malik (#27)
Re: Logical Replication slot disappeared after promote Standby

Yes Shveta!

I can see this repeated message on the New-replica.

2025-06-13 06:20:30.146 UTC [277861] LOG: could not synchronize
replication slot "kafka_logical_slot" because remote slot precedes local
slot
2025-06-13 06:20:30.146 UTC [277861] DETAIL: The remote slot has LSN
0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and
catalog xmin 1088.
2025-06-13 06:21:00.176 UTC [277861] LOG: could not synchronize
replication slot "kafka_logical_slot" because remote slot precedes local
slot
2025-06-13 06:21:00.176 UTC [277861] DETAIL: The remote slot has LSN
0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and
catalog xmin 1088.

(The same LOG and DETAIL pair repeats every 30 seconds through 06:25:30.)

It appears that my Debezium connectors have stopped consuming data,
resulting in an outdated restart_lsn of "0/6D0000B8".

In contrast, the New_replica has a restart_lsn that matches the primary
server's most recent confirmed_flush_lsn, indicating it is up to date.

*As soon as I recreated that replication slot, it synced with the
New_Replica (temporary=false).*

2025-06-13 06:26:00.484 UTC [277861] LOG: dropped replication slot
"kafka_logical_slot" of database with OID 16384

2025-06-13 06:26:30.520 UTC [277861] LOG: starting logical decoding for
slot "kafka_logical_slot"

2025-06-13 06:26:30.520 UTC [277861] DETAIL: Streaming transactions
committing after 0/0, reading WAL from 0/76003140.

2025-06-13 06:26:30.520 UTC [277861] LOG: logical decoding found
consistent point at 0/76003140

2025-06-13 06:26:30.520 UTC [277861] DETAIL: There are no running
transactions.

2025-06-13 06:26:30.526 UTC [277861] LOG: newly created replication slot
"kafka_logical_slot" is sync-ready now

2025-06-13 06:35:39.212 UTC [277857] LOG: restartpoint starting: time

2025-06-13 06:35:42.022 UTC [277857] LOG: restartpoint complete: wrote 29
buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.805 s,
sync=0.002 s, total=2.810 s; sync files=26, longest=0.002 s, average=0.001
s; distance=16496 kB, estimate=16496 kB; lsn=0/7701F480, redo lsn=0/7701F428

2025-06-13 06:35:42.022 UTC [277857] LOG: recovery restart point at
0/7701F428

2025-06-13 06:35:42.022 UTC [277857] DETAIL: Last completed transaction
was at log time 2025-06-13 06:33:31.675341+00.

*Until the synchronization is complete, the slot is marked as
temporary=true, as you mentioned.*

Is there any manual way to advance the *restart_lsn* of a logical
replication slot? This is to ensure slot synchronization.

Thanks,


#29shveta malik
shveta malik
shveta.malik@gmail.com
In reply to: Perumal Raj (#28)
Re: Logical Replication slot disappeared after promote Standby

On Fri, Jun 13, 2025 at 1:00 PM Perumal Raj <perucinci@gmail.com> wrote:

Yes Shveta!

I could see repeated message in New-replica .

2025-06-13 06:20:30.146 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:20:30.146 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
2025-06-13 06:21:00.176 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
2025-06-13 06:21:00.176 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.

(The same LOG and DETAIL pair repeats every 30 seconds through 06:25:30.)

It appears that my Debezium connectors have stopped consuming data, resulting in an outdated restart_lsn of "0/6D0000B8".

Yes, if there are no consumers consuming the changes on the failover
slot on the primary and slot synchronization is started in the meanwhile,
the initial sync may leave the synced slot in such a temporary state.
This is intentionally done to prevent an inconsistent state of the synced
slot and avoid unexpected behaviour if failover is performed at that
moment.

In contrast, the New_replica has a restart_lsn that matches the primary server's most recent confirmed_flush_lsn, indicating it is up to date.

As soon as I recreate that replication slot, it got sync with New_Replica(temporary=false) .

2025-06-13 06:26:00.484 UTC [277861] LOG: dropped replication slot "kafka_logical_slot" of database with OID 16384

2025-06-13 06:26:30.520 UTC [277861] LOG: starting logical decoding for slot "kafka_logical_slot"

2025-06-13 06:26:30.520 UTC [277861] DETAIL: Streaming transactions committing after 0/0, reading WAL from 0/76003140.

2025-06-13 06:26:30.520 UTC [277861] LOG: logical decoding found consistent point at 0/76003140

2025-06-13 06:26:30.520 UTC [277861] DETAIL: There are no running transactions.

2025-06-13 06:26:30.526 UTC [277861] LOG: newly created replication slot "kafka_logical_slot" is sync-ready now

2025-06-13 06:35:39.212 UTC [277857] LOG: restartpoint starting: time

2025-06-13 06:35:42.022 UTC [277857] LOG: restartpoint complete: wrote 29 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.805 s, sync=0.002 s, total=2.810 s; sync files=26, longest=0.002 s, average=0.001 s; distance=16496 kB, estimate=16496 kB; lsn=0/7701F480, redo lsn=0/7701F428

2025-06-13 06:35:42.022 UTC [277857] LOG: recovery restart point at 0/7701F428

2025-06-13 06:35:42.022 UTC [277857] DETAIL: Last completed transaction was at log time 2025-06-13 06:33:31.675341+00.

Until the synchronization is complete, the slot type is marked as temporary=true, as you mentioned.

is there any manual way to advance "restart_lsn" of logical replication slot ? This is to ensure slot synchronization.

1) The first and recommended option is to get the connector running
again and let it advance the slot by consuming the changes.

2) Another option is to manually advance the slot on the primary by
using pg_logical_slot_get_binary_changes(). However, if the logical
replication setup is intended to consume these changes but is
currently inactive, the slot's consumer will not be able to reprocess
those changes upon restarting. So this API should be used only after
analyzing the current state of the logical replication setup, and only
if we are okay with those changes not being shipped to the logical
replication consumers.
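A sketch of option (2), consuming (and discarding) the pending changes on
the primary so the slot can advance; with the pgoutput plugin the options
below are required, and 'mypub' is a hypothetical publication name:

```sql
-- Consumed changes cannot be replayed later by the slot's real consumer.
SELECT count(*)
FROM pg_logical_slot_get_binary_changes(
       'kafka_logical_slot', NULL, NULL,
       'proto_version', '1',
       'publication_names', 'mypub');
```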

thanks
Shveta

#30Perumal Raj
Perumal Raj
perucinci@gmail.com
In reply to: shveta malik (#29)
Re: Logical Replication slot disappeared after promote Standby

Thanks for explanation Shveta!

------------
*As Summary in this original thread,*

1.

*Prerequisites for Setting Up a Logical Replication Slot sync in >= pg17*

To successfully configure a logical replication slot, ensure the
following settings are applied:

wal_level = 'logical'
hot_standby = 'on'
hot_standby_feedback = 'on'
sync_replication_slots = 'on'

2.

*Replication Slot Synchronization*

Logical replication slots can synchronize with all direct standby
servers of the primary but are not compatible with cascade standby servers.
3.

*Temporary Status of New Standby Slots*

If a new standby server is created after the logical replication slot,
it will be marked as temporary=true until the reset_lsn of the primary
matches the confirmed_lsn of the new standby.
4.

*Limitations on Using Logical Replication Slots*

While logical replication slots can synchronize on the direct standby
side, they cannot be utilized (as in the case of Debezium) until the
standby server is promoted to primary. Attempting to use a synchronized
logical slot on a standby server will result in the following error:

org.postgresql.util.PSQLException: ERROR: cannot use replication
slot "kafka_logical_slot" for logical decoding
Detail: This replication slot is being synchronized from the primary server.

*Add-on in this thread,*

*We can advance the restart_lsn of a logical slot using the
pg_logical_slot_get_changes function. However, there is a limitation
regarding the plugin type (specifically, pgoutput).*

replica_test=# SELECT * FROM
pg_logical_slot_get_changes('kafka_logical_slot', NULL, NULL);

ERROR: option "proto_version" missing
CONTEXT: slot "kafka_logical_slot", output plugin "pgoutput", in the
startup callback

*Next, we can create a logical replication slot:*

replica_test=# SELECT pg_create_logical_replication_slot('test',
'test_decoding', false, true, true);

pg_create_logical_replication_slot
------------------------------------
(test, 0/7B001AA0)

*Now, let's attempt to retrieve changes from the new slot:*

replica_test=# SELECT * FROM pg_logical_slot_get_changes('test', NULL, NULL);

WARNING: cannot specify logical replication slot "kafka_logical_slot"
in parameter "synchronized_standby_slots"
DETAIL: Logical replication is waiting for correction on replication
slot "kafka_logical_slot".
HINT: Remove the logical replication slot "kafka_logical_slot" from
parameter "synchronized_standby_slots".

*To resolve this, we will alter the system settings:*

replica_test=# ALTER SYSTEM SET synchronized_standby_slots = '';

*Finally, we can check for changes again:*

replica_test=# SELECT * FROM pg_logical_slot_get_changes('test', NULL, NULL);

lsn | xid | data
-------------+------+----------------------------------------------
0/7B001AA0 | 1218 | BEGIN 1218
0/7B00B9D0 | 1218 | table public.customers_1: TRUNCATE: (no-flags)
0/7B00BB70 | 1218 | COMMIT 1218

Thanks Shveta, Zhijie Hou

Please correct me if needed.


#31Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Perumal Raj (#30)
Re: Logical Replication slot disappeared after promote Standby

On Fri, Jun 13, 2025 at 10:52 PM Perumal Raj <perucinci@gmail.com> wrote:

Thanks for explanation Shveta!

------------
As Summary in this original thread,

Prerequisites for Setting Up a Logical Replication Slot sync in >= pg17

To successfully configure a logical replication slot, ensure the following settings are applied:

wal_level = 'logical'
hot_standby = 'on'
hot_standby_feedback = 'on'
sync_replication_slots = 'on'

Additionally, you need to configure primary_slot_name on the standby
and have dbname in primary_conninfo. For further details, you can
refer to the docs (1)(2).

Replication Slot Synchronization

Logical replication slots can synchronize with all direct standby servers of the primary but are not compatible with cascade standby servers.

Temporary Status of New Standby Slots

If a new standby server is created after the logical replication slot, it will be marked as temporary=true until the reset_lsn of the primary matches the confirmed_lsn of the new standby.

It is restart_lsn on both nodes, but there are other things like the
slot's catalog_xmin as well. As a user, you need to ensure that your
primary's logical slot is being consumed. This is required primarily at
initial sync time, so that we sync the slot only if the standby has the
required resources, like WAL, to allow decoding from the
synced slot after failover.
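One way to watch for this situation is a query on the primary (a sketch; a
large gap between restart_lsn and the current WAL position suggests the
slot's consumer has stalled, which can keep the initial sync from
completing on a newly created standby):

```sql
SELECT slot_name, catalog_xmin, restart_lsn,
       pg_current_wal_lsn() AS current_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';
```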

Limitations on Using Logical Replication Slots

While logical replication slots can synchronize on the direct standby side, they cannot be utilized (as in the case of Debezium) until the standby server is promoted to primary. Attempting to use a synchronized logical slot on a standby server will result in the following error:

org.postgresql.util.PSQLException: ERROR: cannot use replication slot "kafka_logical_slot" for logical decoding
Detail: This replication slot is being synchronized from the primary server.

I don't think we can call this a limitation. In my view, this is
a requirement for this feature to work. Consider: if we allowed the use
of this synced slot for decoding while sync is still in progress, the
slot could be advanced ahead of the primary. Then, after the failover,
we would not be able to reuse this slot to allow the subscribers to
continue replication.

(1) - https://www.postgresql.org/docs/devel/logicaldecoding-explanation.html#LOGICALDECODING-REPLICATION-SLOTS-SYNCHRONIZATION
(2) - https://www.postgresql.org/docs/devel/logical-replication-failover.html

--
With Regards,
Amit Kapila.

#32Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#23)
Re: Replication slot is not able to sync up

On Thu, Jun 12, 2025 at 10:44 AM shveta malik <shveta.malik@gmail.com> wrote:

On Thu, Jun 12, 2025 at 4:13 AM Peter Smith <smithpb2250@gmail.com> wrote:

Phrases like "... it is recommended..." and "... intended for testing
and debugging .. " and "... should be used with caution." and "... it
is advisable to..." seem like indicators that parts of the above
description should be using SGML markup such as <caution> or <warning>
or <note> instead of just plain text.

I feel WARNING and CAUTION markups could be a little strong for the
case in question. Such markups are generally used when there is a
side-effect involved with the usage, but in our case there is no such
side-effect with the API. At most it may fail, without harming the
system, and will succeed on the next invocation. But I also feel that
such sections catch the user's attention. Thus, if needed, we can have a
NOTE section to convey the recommended way of slot synchronization.

I think NOTE is fine for the API in this case, but we can mention that the
API is more prone to getting the synchronization failure message, as you
have shown in the patch. It would also be better to briefly explain in
user terms why the API is more prone to such a failure.
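A minimal sketch of what such a note could look like in the SGML docs
(wording is illustrative, not the patch text):

```sgml
<note>
 <para>
  <function>pg_sync_replication_slots</function> is intended for testing
  and debugging.  A manual invocation is more likely to report a
  synchronization failure than the automatic worker, because a single call
  may run before the primary slot's position is safe to copy; with
  <varname>sync_replication_slots</varname> enabled, synchronization is
  retried until the slot is sync-ready.
 </para>
</note>
```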

--
With Regards,
Amit Kapila.

#33Perumal Raj
Perumal Raj
perucinci@gmail.com
In reply to: Amit Kapila (#31)
Resolved: Logical Replication slot disappeared after promote Standby

On Fri, Jun 13, 2025 at 9:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jun 13, 2025 at 10:52 PM Perumal Raj <perucinci@gmail.com> wrote:

Thanks for explanation Shveta!

------------
As Summary in this original thread,

Prerequisites for Setting Up a Logical Replication Slot sync in >= pg17

To successfully configure a logical replication slot, ensure the

following settings are applied:

wal_level = 'logical'
hot_standby = 'on'
hot_standby_feedback = 'on'
sync_replication_slots = 'on'

Additionally, you need to configure primary_slot_name on the standby
and have dbname in primary_conninfo. For further details, you can
refer docs (1)(2).

*Thanks for the notes here.*
*Yes, those parameters are configured as part of the normal physical
replication setup.*

Replication Slot Synchronization

Logical replication slots can synchronize with all direct standby

servers of the primary but are not compatible with cascade standby servers.

Temporary Status of New Standby Slots

If a new standby server is created after the logical replication slot,

it will be marked as temporary=true until the reset_lsn of the primary
matches the confirmed_lsn of the new standby.

It is restart_lsn on both nodes, but there are other things like
slot's catalog_xmin as well. As a user, you need to ensure that your
primary's logical slot is being consumed. And this is required
primarily at the initial sync time so that we sync the slot only if
the standby has required resources like WAL to allow decoding from the
synced slot after failover.

*Thanks for correcting,*

Limitations on Using Logical Replication Slots

While logical replication slots can synchronize on the direct standby

side, they cannot be utilized (as in the case of Debezium) until the
standby server is promoted to primary. Attempting to use a synchronized
logical slot on a standby server will result in the following error:

org.postgresql.util.PSQLException: ERROR: cannot use replication slot
"kafka_logical_slot" for logical decoding
Detail: This replication slot is being synchronized from the primary server.

I don't think we can call this a limitation. In my view, this is a
requirement for the feature to work. Consider: if we allowed the use
of this synced slot for decoding while the sync is still in progress,
the slot could be advanced ahead of the primary. Then, after the
failover, we would not be able to reuse this slot to allow the
subscribers to continue replication.

*Makes sense,*


(1) -
https://www.postgresql.org/docs/devel/logicaldecoding-explanation.html#LOGICALDECODING-REPLICATION-SLOTS-SYNCHRONIZATION
(2) -
https://www.postgresql.org/docs/devel/logical-replication-failover.html

--
With Regards,
Amit Kapila.

#34Dilip Kumar
Dilip Kumar
dilipbalaut@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#13)
Re: Replication slot is not able to sync up

On Fri, May 30, 2025 at 3:38 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

On Wed, May 28, 2025 at 2:09 AM Masahiko Sawada wrote:

On Fri, May 23, 2025 at 10:07 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

In the case presented here, the logical slot is expected to keep
moving forward, and in a subsequent sync cycle the sync should
succeed. Users of the logical decoding APIs should also be aware
that if, for some reason, the logical slot is not moving forward,
the master/publisher node will start accumulating dead rows and WAL,
which can create bigger problems.

I've tried this case and am concerned that the slot synchronization using
pg_sync_replication_slots() would never succeed while the primary keeps
getting write transactions. Even if the user manually consumes changes on the
primary, the primary server keeps advancing its XID in the meanwhile. On the
standby, we ensure that the
TransamVariables->nextXid is beyond the XID of WAL record that it's
going to apply so the xmin horizon calculated by
GetOldestSafeDecodingTransactionId() always ends up higher than the
slot's catalog_xmin on the primary. We get the log message "could not
synchronize replication slot "s" because remote slot precedes local slot" and
clean up the slot on the standby at the end of pg_sync_replication_slots().
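The mismatch described here can be observed directly from SQL. A
diagnostic sketch, using the slot name from the original report: run the
same query on both nodes and compare catalog_xmin, which on the standby
ends up ahead of the primary's value:

```sql
-- run on both the primary and the standby
SELECT slot_name, restart_lsn, confirmed_flush_lsn,
       catalog_xmin, temporary, synced
FROM pg_replication_slots
WHERE slot_name = 'failover_slot';
```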

To improve this workload scenario, we can modify pg_sync_replication_slots() to
wait for the primary slot to advance to a suitable position before completing
synchronization and removing the temporary slot. This would allow the sync to
complete as soon as the primary slot advances, whether through
pg_logical_xx_get_changes() or other ways.

I've created a POC (attached) that currently waits indefinitely for the remote
slot to catch up. We could later add a timeout parameter to control maximum
wait time if this approach seems acceptable.

I tested that, when pgbench TPC-B is running on the primary, calling
pg_sync_replication_slots() on the standby correctly blocks until I advance the
primary slot position by calling pg_logical_xx_get_changes().
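As a manual workaround until such an API change lands, the primary slot
can be moved forward explicitly before retrying the sync. A sketch,
using the slot name from the original report; note that
pg_replication_slot_advance() skips over the intervening changes, so it
is only suitable when no consumer still needs them:

```sql
-- on the primary: advance the logical slot to the current WAL position
SELECT pg_replication_slot_advance('failover_slot', pg_current_wal_lsn());

-- on the standby: retry the synchronization
SELECT pg_sync_replication_slots();
```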

If the basic idea sounds reasonable, I can start a separate
thread to extend this API. Thoughts?

IMHO, this idea has merit. Have you started a thread for reviewing this patch?

--
Regards,
Dilip Kumar
Google

#35Zhijie Hou (Fujitsu)
Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Dilip Kumar (#34)
RE: Replication slot is not able to sync up

On Sat, Jun 14, 2025 at 11:37 PM Dilip Kumar wrote:

On Fri, May 30, 2025 at 3:38 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
wrote:

On Wed, May 28, 2025 at 2:09 AM Masahiko Sawada wrote:

On Fri, May 23, 2025 at 10:07 PM Amit Kapila
<amit.kapila16@gmail.com>
wrote:

In the case presented here, the logical slot is expected to keep
moving forward, and in a subsequent sync cycle the sync should
succeed. Users of the logical decoding APIs should also be aware
that if, for some reason, the logical slot is not moving
forward, the master/publisher node will start accumulating dead
rows and WAL, which can create bigger problems.

I've tried this case and am concerned that the slot synchronization
using
pg_sync_replication_slots() would never succeed while the primary
keeps getting write transactions. Even if the user manually consumes
changes on the primary, the primary server keeps advancing its XID
in the meanwhile. On the standby, we ensure that the
TransamVariables->nextXid is beyond the XID of WAL record that it's
going to apply so the xmin horizon calculated by
GetOldestSafeDecodingTransactionId() always ends up higher
than the slot's catalog_xmin on the primary. We get the log message
"could not synchronize replication slot "s" because remote slot
precedes local slot" and clean up the slot on the standby at the end of
pg_sync_replication_slots().

To improve this workload scenario, we can modify
pg_sync_replication_slots() to wait for the primary slot to advance to
a suitable position before completing synchronization and removing the
temporary slot. This would allow the sync to complete as soon as the
primary slot advances, whether through
pg_logical_xx_get_changes() or other ways.

I've created a POC (attached) that currently waits indefinitely for
the remote slot to catch up. We could later add a timeout parameter to
control maximum wait time if this approach seems acceptable.

I tested that, when pgbench TPC-B is running on the primary, calling
pg_sync_replication_slots() on the standby correctly blocks until I
advance the primary slot position by calling pg_logical_xx_get_changes().

If the basic idea sounds reasonable, I can start a separate thread
to extend this API. Thoughts?

IMHO, this idea has merit. Have you started a thread for reviewing this patch?

Thank you for looking at it. I plan to start a new thread soon for the
upcoming commit fest, after some additional testing and documentation cleanup.

Best Regards,
Hou zj

#36shveta malik
shveta malik
shveta.malik@gmail.com
In reply to: Amit Kapila (#32)
1 attachment(s)
Re: Replication slot is not able to sync up

On Sat, Jun 14, 2025 at 11:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

I feel WARNING and CAUTION markups could be a little strong for the
case concerned. Such markups are generally used when there is a
side-effect involved with the usage. But in our case, there is no such
side-effect with the API. At most, it may fail without harming the
system and will succeed on the next invocation. But I also feel that
such sections catch the user's attention. Thus, if needed, we can have
a NOTE section to convey the recommended way of slot synchronization.

I think a NOTE is fine for the API in this case, but we can mention
that the API is more prone to the synchronization failure message, as
you have shown in the patch. It would also be better to briefly explain
in user terms why the API is more prone to such a failure.

Thanks Peter and Amit for feedback. I have updated the patch.

thanks
Shveta

Attachments:

v5-0001-Improve-log-messages-and-docs-for-slotsync.patch (application/octet-stream)
#37Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#36)
Re: Replication slot is not able to sync up

On Mon, Jun 16, 2025 at 9:27 AM shveta malik <shveta.malik@gmail.com> wrote:

Thanks Peter and Amit for feedback. I have updated the patch.

<para>
+     When slot-synchronization setup is done as recommended, and
+     slot-synchronization is performed the very first time either automatically
+     or by <link linkend="pg-sync-replication-slots">
+     <function>pg_sync_replication_slots</function></link>,
+     then for the synchronized slot to be created and persisted on the standby,
+     one condition must be met. The logical replication slot on the primary
+     must reach a state where the WALs and system catalog rows retained by
+     the slot are also present on the corresponding standby server. This is
+     needed to prevent any data loss and to allow logical replication
+     to continue
+
...

This whole paragraph sounds like a duplicate of its previous section,
and the line alignment in the first paragraph has some issues.

--
With Regards,
Amit Kapila.

#38shveta malik
shveta malik
shveta.malik@gmail.com
In reply to: Amit Kapila (#37)
1 attachment(s)
Re: Replication slot is not able to sync up

On Tue, Jun 17, 2025 at 12:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

This whole paragraph sounds like a duplicate of its previous section,
and the line alignment in the first paragraph has some issues.

Sorry for the wrong upload; the duplication was a merge issue. Removed
the duplicate paragraph and corrected the indentation.

thanks
Shveta

Attachments:

v6-0001-Improve-log-messages-and-docs-for-slotsync.patch (application/octet-stream)
#39Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#38)
Re: Replication slot is not able to sync up

On Wed, Jun 18, 2025 at 8:52 AM shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Jun 17, 2025 at 12:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

This whole paragraph sounds like a duplicate of its previous section,
and the line alignment in the first paragraph has some issues.

Sorry for the wrong upload; the duplication was a merge issue. Removed
the duplicate paragraph and corrected the indentation.

LGTM. I'll push this to HEAD and 17 tomorrow unless there are more
comments or objections.

--
With Regards,
Amit Kapila.

#40Amit Kapila
Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#39)
Re: Replication slot is not able to sync up

On Wed, Jun 18, 2025 at 10:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jun 18, 2025 at 8:52 AM shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Jun 17, 2025 at 12:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

This whole paragraph sounds like a duplicate of its previous section,
and the line alignment in the first paragraph has some issues.

Sorry for the wrong upload; the duplication was a merge issue. Removed
the duplicate paragraph and corrected the indentation.

LGTM. I'll push this to HEAD and 17 tomorrow unless there are more
comments or objections.

Pushed.

--
With Regards,
Amit Kapila.