Restrict copying of invalidated replication slots
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1]/messages/by-id/CAA4eK1Kw=vZ2FZ4DdrmOhuxOAL=2abaBm8hu_PsVN2Hd6UFP-w@mail.gmail.com, we should prohibit copying of such slots.
I have created a patch to address the issue.
[1]: /messages/by-id/CAA4eK1Kw=vZ2FZ4DdrmOhuxOAL=2abaBm8hu_PsVN2Hd6UFP-w@mail.gmail.com
Thanks and Regards,
Shlok Kyal
Attachments:
v1-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=v1-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 349df8eada0d3eae63b009e34691240aad8cf258 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Tue, 4 Feb 2025 14:58:32 +0530
Subject: [PATCH v1] Restrict copying of invalidated replication slots
Currently we can copy an invalidated slot using function
'pg_copy_logical_replication_slot'. With this patch we will throw an
error in such case.
---
src/backend/replication/slotfuncs.c | 8 ++++++++
src/test/recovery/t/035_standby_logical_decoding.pl | 8 ++++++++
2 files changed, 16 insertions(+)
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 8be4b8c65b..f83a890c50 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -604,6 +604,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
ReplicationSlot second_slot_contents;
XLogRecPtr src_restart_lsn;
bool src_islogical;
+ bool src_isinvalidated;
bool temporary;
char *plugin;
Datum values[2];
@@ -661,6 +662,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -678,6 +680,12 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* We should not copy invalidated replication slots */
+ if (src_isinvalidated)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy an invalidated replication slot")));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 505e85d1eb..efebf72bd6 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -553,6 +553,14 @@ $handle =
check_pg_recvlogical_stderr($handle,
"can no longer access replication slot \"vacuum_full_activeslot\"");
+# Attempting to copy an invalidated slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok($stderr =~ /ERROR: cannot copy an invalidated replication slot/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios
completely, there is a very corner concurrency case where an
invalidated slot still gets copied:
+ /* We should not copy invalidated replication slots */
+ if (src_isinvalidated)
+ ereport(ERROR,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy an invalidated
replication slot")));
Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primary node.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copied
After this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2');
?column?
----------
copy
(1 row)
-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced
-----------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+---------
--+----------------------------------+-------------+------------------------+----------+--------
test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)
-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary
-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)
This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.
Regards,
Vignesh
Hi. Some review comments for patch v1-0001.
======
1. DOCS?
Shouldn't the documentation [1]https://www.postgresql.org/docs/current/functions-admin.html for pg_copy_logical_replication_slot()
and pg_copy_physical_replication_slot() be updated to mention this?
======
src/backend/replication/slotfuncs.c
2.
+ /* We should not copy invalidated replication slots */
+ if (src_isinvalidated)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy an invalidated replication slot")));
+
2a.
The "we should not" sounds more like a recommendation than an error.
Comment can just say the same as the as errmsg.
~
2b.
ereport does not need all these parentheses
~
2c.
I felt the errmsg should include the name of the slot.
~~~
2d.
AFAICT this code will emit the same error regardless of
logical/physical slot so maybe you need to modify following to cater
for both kinds of replication_slot:
- the commit message
- docs
- test code
======
src/test/recovery/t/035_standby_logical_decoding.pl
3.
+# Attempting to copy an invalidated slot
+($result, $stdout, $stderr) = $node_standby->psql(
/Attempting/Attempt/
======
[1]: https://www.postgresql.org/docs/current/functions-admin.html
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primary node.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2');
?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced
-----------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+---------
--+----------------------------------+-------------+------------------------+----------+--------
test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.
Hi Vignesh,
Thanks for reviewing the patch.
I have tested the above scenario and was able to reproduce it. I have
fixed it in the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?
Thanks and Regards,
Shlok Kyal
Attachments:
v2-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=v2-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From bb105104f249804a274c9cb2ffa9eaa9a0c41074 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Tue, 4 Feb 2025 14:58:32 +0530
Subject: [PATCH v2] Restrict copying of invalidated replication slots
Currently we can copy an invalidated logical and physical replication slot
using functions 'pg_copy_logical_replication_slot' and
'pg_copy_physical_replication_slot' respectively.
With this patch we will throw an error in such cases.
---
doc/src/sgml/func.sgml | 4 +++-
src/backend/replication/slotfuncs.c | 11 ++++++++++-
src/test/recovery/t/035_standby_logical_decoding.pl | 9 +++++++++
3 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7efc81936a..c88885e419 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29342,7 +29342,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated physical replication slot in not allowed.
</para></entry>
</row>
@@ -29364,6 +29365,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated logical replication slot in not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 8be4b8c65b..8a6365c1f8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -604,6 +604,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
ReplicationSlot second_slot_contents;
XLogRecPtr src_restart_lsn;
bool src_islogical;
+ bool src_isinvalidated;
bool temporary;
char *plugin;
Datum values[2];
@@ -622,7 +623,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
else
CheckSlotRequirements();
- LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
/*
* We need to prevent the source slot's reserved WAL from being removed,
@@ -661,6 +662,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -678,6 +680,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (src_isinvalidated)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 505e85d1eb..23b235412a 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -553,6 +553,15 @@ $handle =
check_pg_recvlogical_stderr($handle,
"can no longer access replication slot \"vacuum_full_activeslot\"");
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
On Mon, 17 Feb 2025 at 10:37, Peter Smith <smithpb2250@gmail.com> wrote:
Hi. Some review comments for patch v1-0001.
======
1. DOCS?Shouldn't the documentation [1] for pg_copy_logical_replication_slot()
and pg_copy_physical_replication_slot() be updated to mention this?======
src/backend/replication/slotfuncs.c
Updated the documentation
2. + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot"))); +2a.
The "we should not" sounds more like a recommendation than an error.
Comment can just say the same as the as errmsg.
Fixed
~
2b.
ereport does not need all these parentheses
Removed extra parentheses
~
2c.
I felt the errmsg should include the name of the slot.
Added the slot name in error message
~~~
2d.
AFAICT this code will emit the same error regardless of
logical/physical slot so maybe you need to modify following to cater
for both kinds of replication_slot:
- the commit message
Fixed
- docs
Fixed
- test code
Currently a physical replication slot can only be invalidated for
"wal_removed". And logical replication slot can be invalidated for
"wal_removed", "rows_removed" , "wal_level_insufficient".
And for copying the slot invalidated for "wal_removed" throws error
"ERROR: cannot copy a replication slot that doesn't reserve WAL".
So, I have added test only for the case of logical replication slot.
======
src/test/recovery/t/035_standby_logical_decoding.pl3. +# Attempting to copy an invalidated slot +($result, $stdout, $stderr) = $node_standby->psql(/Attempting/Attempt/
Fixed
======
[1] https://www.postgresql.org/docs/current/functions-admin.html
I have updated the changes in v2 patch [1]/messages/by-id/CANhcyEVJpb6+hnk4MPVU3hZBYL=DS4v-PYBZOUoiivrN8Vd_Bw@mail.gmail.com.
[1]: /messages/by-id/CANhcyEVJpb6+hnk4MPVU3hZBYL=DS4v-PYBZOUoiivrN8Vd_Bw@mail.gmail.com
Thanks and Regards,
Shlok Kyal
On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primarynode.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2'); ?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced-----------+---------------+-----------+--------+----------+-----------+
--------+------------+------+--------------+-------------+---------------
------+------------+---------------+-----------+----------------------------------+-------------+----------------------
--+----------+--------test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.Hi Vignesh,
Thanks for reviewing the patch.
Thanks for updating the patch. I have a question related to it.
I have tested the above scenario and was able to reproduce it. I have fixed it in
the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?
It's not clear to me why increasing the lock level can solve it, could you
elaborate a bit more on this ?
Besides, do we need one more invalidated check in the following codes after
creating the slot ?
/*
* Check if the source slot still exists and is valid. We regard it as
* invalid if the type of replication slot or name has been changed,
* or the restart_lsn either is invalid or has gone backward. (The
...
Best Regards,
Hou zj
On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primarynode.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2'); ?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced-----------+---------------+-----------+--------+----------+-----------+
--------+------------+------+--------------+-------------+---------------
------+------------+---------------+-----------+----------------------------------+-------------+----------------------
--+----------+--------test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.Hi Vignesh,
Thanks for reviewing the patch.
Thanks for updating the patch. I have a question related to it.
I have tested the above scenario and was able to reproduce it. I have fixed it in
the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?It's not clear to me why increasing the lock level can solve it, could you
elaborate a bit more on this ?
In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on
'ReplicationSlotControlLock'
Also in function 'copy_replication_slot' we take a SHARED lock on
'ReplicationSlotControlLock' during fetching of source slot.
So, for the case described by Vignesh in [1]/messages/by-id/CALDaNm2rrxO5mg6OKoScw84K5P1Tw_cbjniHm+Geyxme8Ei-nQ@mail.gmail.com, first
InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock
on 'ReplicationSlotControlLock'. We are now holding the function using
the sleep
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
Now we create a copy of the slot since 'copy_replication_slot' takes
a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock
and fetch the info of the source slot (the slot is not invalidated
till now). and the function 'copy_replication_slot' calls function
'create_logical_replication_slot' which takes a EXCLUSIVE lock on
ReplicationSlotControlLock and hence it will wait for function
InvalidateObsoleteReplicationSlot to release lock. Once the function
'InvalidateObsoleteReplicationSlot' releases the lock, the execution
of 'create_logical_replication_slot' continues and creates a copy of
the source slot.
Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE
lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence,
it will wait for the 'InvalidateObsoleteReplicationSlot' to release
the lock and then fetch the source slot info and try to create the
copied slot (which will fail as source slot is invalidated before we
fetch its info)
Besides, do we need one more invalidated check in the following codes after
creating the slot ?/*
* Check if the source slot still exists and is valid. We regard it as
* invalid if the type of replication slot or name has been changed,
* or the restart_lsn either is invalid or has gone backward. (The
...
This approach seems more feasible to me. It also resolves the issue
suggested by Vignesh in [1]/messages/by-id/CALDaNm2rrxO5mg6OKoScw84K5P1Tw_cbjniHm+Geyxme8Ei-nQ@mail.gmail.com. I have made changes for the same in v3
patch.
[1]: /messages/by-id/CALDaNm2rrxO5mg6OKoScw84K5P1Tw_cbjniHm+Geyxme8Ei-nQ@mail.gmail.com
Thanks and Regards,
Shlok Kyal
Attachments:
v3-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=v3-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 629a0eab98e5f040f7b4b3f6d16a4b0ece8d0063 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Tue, 4 Feb 2025 14:58:32 +0530
Subject: [PATCH v3] Restrict copying of invalidated replication slots
Currently we can copy an invalidated logical and physical replication slot
using functions 'pg_copy_logical_replication_slot' and
'pg_copy_physical_replication_slot' respectively.
With this patch we will throw an error in such cases.
---
doc/src/sgml/func.sgml | 4 +-
src/backend/replication/slotfuncs.c | 37 +++++++++++++++++++
.../t/035_standby_logical_decoding.pl | 9 +++++
3 files changed, 49 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df32ee0bf5b..1bed0103723 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29347,7 +29347,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated physical replication slot in not allowed.
</para></entry>
</row>
@@ -29369,6 +29370,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated logical replication slot in not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index f652ec8a73e..c845e16f80a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -604,6 +604,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
ReplicationSlot second_slot_contents;
XLogRecPtr src_restart_lsn;
bool src_islogical;
+ bool src_isinvalidated;
bool temporary;
char *plugin;
Datum values[2];
@@ -661,6 +662,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -678,6 +680,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (src_isinvalidated)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -774,6 +783,34 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot was invalidated while copying of slot */
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ if (s->in_use && strcmp(NameStr(s->data.name), NameStr(*src_name)) == 0)
+ {
+ /* Copy the slot contents while holding spinlock */
+ SpinLockAcquire(&s->mutex);
+ first_slot_contents = *s;
+ SpinLockRelease(&s->mutex);
+ src = s;
+ break;
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
+
+ if (src_isinvalidated)
+ ereport(ERROR,
+ (errmsg("could not copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation.")));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 505e85d1eb6..23b235412ac 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -553,6 +553,15 @@ $handle =
check_pg_recvlogical_stderr($handle,
"can no longer access replication slot \"vacuum_full_activeslot\"");
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primarynode.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2'); ?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced-----------+---------------+-----------+--------+----------+-----------+
--------+------------+------+--------------+-------------+---------------
------+------------+---------------+-----------+----------------------------------+-------------+----------------------
--+----------+--------test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.Hi Vignesh,
Thanks for reviewing the patch.
Thanks for updating the patch. I have a question related to it.
I have tested the above scenario and was able to reproduce it. I have fixed it in
the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?It's not clear to me why increasing the lock level can solve it, could you
elaborate a bit more on this ?In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on
'ReplicationSlotControlLock'
Also in function 'copy_replication_slot' we take a SHARED lock on
'ReplicationSlotControlLock' during fetching of source slot.So, for the case described by Vignesh in [1], first
InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock
on 'ReplicationSlotControlLock'. We are now holding the function using
the sleep
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}Now we create a copy of the slot since 'copy_replication_slot' takes
a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock
and fetch the info of the source slot (the slot is not invalidated
till now). and the function 'copy_replication_slot' calls function
'create_logical_replication_slot' which takes a EXCLUSIVE lock on
ReplicationSlotControlLock and hence it will wait for function
InvalidateObsoleteReplicationSlot to release lock. Once the function
'InvalidateObsoleteReplicationSlot' releases the lock, the execution
of 'create_logical_replication_slot' continues and creates a copy of
the source slot.Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE
lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence,
it will wait for the 'InvalidateObsoleteReplicationSlot' to release
the lock and then fetch the source slot info and try to create the
copied slot (which will fail as source slot is invalidated before we
fetch its info)Besides, do we need one more invalidated check in the following codes after
creating the slot ?/*
* Check if the source slot still exists and is valid. We regard it as
* invalid if the type of replication slot or name has been changed,
* or the restart_lsn either is invalid or has gone backward. (The
...This approach seems more feasible to me. It also resolves the issue
suggested by Vignesh in [1]. I have made changes for the same in v3
patch.
I agree to check if the source slot got invalidated during the copy.
But why do we need to search the slot by the slot name again as
follows?
+ /* Check if source slot was invalidated while copying of slot */
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ if (s->in_use && strcmp(NameStr(s->data.name),
NameStr(*src_name)) == 0)
+ {
+ /* Copy the slot contents while holding spinlock */
+ SpinLockAcquire(&s->mutex);
+ first_slot_contents = *s;
+ SpinLockRelease(&s->mutex);
+ src = s;
+ break;
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
I think 'src' already points to the source slot.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, 21 Feb 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primarynode.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2'); ?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced-----------+---------------+-----------+--------+----------+-----------+
--------+------------+------+--------------+-------------+---------------
------+------------+---------------+-----------+----------------------------------+-------------+----------------------
--+----------+--------test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.Hi Vignesh,
Thanks for reviewing the patch.
Thanks for updating the patch. I have a question related to it.
I have tested the above scenario and was able to reproduce it. I have fixed it in
the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?It's not clear to me why increasing the lock level can solve it, could you
elaborate a bit more on this ?In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on
'ReplicationSlotControlLock'
Also in function 'copy_replication_slot' we take a SHARED lock on
'ReplicationSlotControlLock' during fetching of source slot.So, for the case described by Vignesh in [1], first
InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock
on 'ReplicationSlotControlLock'. We are now holding the function using
the sleep
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}Now we create a copy of the slot since 'copy_replication_slot' takes
a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock
and fetch the info of the source slot (the slot is not invalidated
till now). and the function 'copy_replication_slot' calls function
'create_logical_replication_slot' which takes a EXCLUSIVE lock on
ReplicationSlotControlLock and hence it will wait for function
InvalidateObsoleteReplicationSlot to release lock. Once the function
'InvalidateObsoleteReplicationSlot' releases the lock, the execution
of 'create_logical_replication_slot' continues and creates a copy of
the source slot.Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE
lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence,
it will wait for the 'InvalidateObsoleteReplicationSlot' to release
the lock and then fetch the source slot info and try to create the
copied slot (which will fail as source slot is invalidated before we
fetch its info)Besides, do we need one more invalidated check in the following codes after
creating the slot ?/*
* Check if the source slot still exists and is valid. We regard it as
* invalid if the type of replication slot or name has been changed,
* or the restart_lsn either is invalid or has gone backward. (The
...This approach seems more feasible to me. It also resolves the issue
suggested by Vignesh in [1]. I have made changes for the same in v3
patch.I agree to check if the source slot got invalidated during the copy.
But why do we need to search the slot by the slot name again as
follows?+ /* Check if source slot was invalidated while copying of slot */ + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); + + for (int i = 0; i < max_replication_slots; i++) + { + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; + + if (s->in_use && strcmp(NameStr(s->data.name), NameStr(*src_name)) == 0) + { + /* Copy the slot contents while holding spinlock */ + SpinLockAcquire(&s->mutex); + first_slot_contents = *s; + SpinLockRelease(&s->mutex); + src = s; + break; + } + } + + LWLockRelease(ReplicationSlotControlLock);I think 'src' already points to the source slot.
Hi Sawada san,
Thanks for reviewing the patch.
I have used the 'src' instead of iterating again. I have attached the
updated v4 patch.
Thanks and Regards,
Shlok Kyal
Attachments:
v4-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=v4-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From a5e1d31ed2fdb30ac8c87c7dad9da935e9df07d6 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Tue, 4 Feb 2025 14:58:32 +0530
Subject: [PATCH v4] Restrict copying of invalidated replication slots
Currently we can copy an invalidated logical and physical replication slot
using functions 'pg_copy_logical_replication_slot' and
'pg_copy_physical_replication_slot' respectively.
With this patch we will throw an error in such cases.
---
doc/src/sgml/func.sgml | 4 +++-
src/backend/replication/slotfuncs.c | 22 +++++++++++++++++++
.../t/035_standby_logical_decoding.pl | 9 ++++++++
3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df32ee0bf5b..1bed0103723 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29347,7 +29347,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated physical replication slot in not allowed.
</para></entry>
</row>
@@ -29369,6 +29370,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated logical replication slot in not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index f652ec8a73e..7669604cbbb 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -604,6 +604,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
ReplicationSlot second_slot_contents;
XLogRecPtr src_restart_lsn;
bool src_islogical;
+ bool src_isinvalidated;
bool temporary;
char *plugin;
Datum values[2];
@@ -661,6 +662,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -678,6 +680,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (src_isinvalidated)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -774,6 +783,19 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot was invalidated while copying of slot */
+ SpinLockAcquire(&src->mutex);
+ first_slot_contents = *src;
+ SpinLockRelease(&src->mutex);
+
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
+
+ if (src_isinvalidated)
+ ereport(ERROR,
+ (errmsg("could not copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation.")));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 505e85d1eb6..23b235412ac 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -553,6 +553,15 @@ $handle =
check_pg_recvlogical_stderr($handle,
"can no longer access replication slot \"vacuum_full_activeslot\"");
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
On Fri, Feb 21, 2025 at 4:30 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Fri, 21 Feb 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primarynode.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2'); ?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced-----------+---------------+-----------+--------+----------+-----------+
--------+------------+------+--------------+-------------+---------------
------+------------+---------------+-----------+----------------------------------+-------------+----------------------
--+----------+--------test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.Hi Vignesh,
Thanks for reviewing the patch.
Thanks for updating the patch. I have a question related to it.
I have tested the above scenario and was able to reproduce it. I have fixed it in
the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?It's not clear to me why increasing the lock level can solve it, could you
elaborate a bit more on this ?In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on
'ReplicationSlotControlLock'
Also in function 'copy_replication_slot' we take a SHARED lock on
'ReplicationSlotControlLock' during fetching of source slot.So, for the case described by Vignesh in [1], first
InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock
on 'ReplicationSlotControlLock'. We are now holding the function using
the sleep
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}Now we create a copy of the slot since 'copy_replication_slot' takes
a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock
and fetch the info of the source slot (the slot is not invalidated
till now). and the function 'copy_replication_slot' calls function
'create_logical_replication_slot' which takes a EXCLUSIVE lock on
ReplicationSlotControlLock and hence it will wait for function
InvalidateObsoleteReplicationSlot to release lock. Once the function
'InvalidateObsoleteReplicationSlot' releases the lock, the execution
of 'create_logical_replication_slot' continues and creates a copy of
the source slot.Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE
lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence,
it will wait for the 'InvalidateObsoleteReplicationSlot' to release
the lock and then fetch the source slot info and try to create the
copied slot (which will fail as source slot is invalidated before we
fetch its info)Besides, do we need one more invalidated check in the following codes after
creating the slot ?/*
* Check if the source slot still exists and is valid. We regard it as
* invalid if the type of replication slot or name has been changed,
* or the restart_lsn either is invalid or has gone backward. (The
...This approach seems more feasible to me. It also resolves the issue
suggested by Vignesh in [1]. I have made changes for the same in v3
patch.I agree to check if the source slot got invalidated during the copy.
But why do we need to search the slot by the slot name again as
follows?+ /* Check if source slot was invalidated while copying of slot */ + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); + + for (int i = 0; i < max_replication_slots; i++) + { + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; + + if (s->in_use && strcmp(NameStr(s->data.name), NameStr(*src_name)) == 0) + { + /* Copy the slot contents while holding spinlock */ + SpinLockAcquire(&s->mutex); + first_slot_contents = *s; + SpinLockRelease(&s->mutex); + src = s; + break; + } + } + + LWLockRelease(ReplicationSlotControlLock);I think 'src' already points to the source slot.
Hi Sawada san,
Thanks for reviewing the patch.
I have used the 'src' instead of iterating again. I have attached the
updated v4 patch.
Thank you for updating the patch! I have one comment:
+ /* Check if source slot was invalidated while copying of slot */
+ SpinLockAcquire(&src->mutex);
+ first_slot_contents = *src;
+ SpinLockRelease(&src->mutex);
We don't need to copy the source slot contents again since we already
do as follows:
/* Copy data of source slot again */
SpinLockAcquire(&src->mutex);
second_slot_contents = *src;
SpinLockRelease(&src->mutex);
I think we can use second_slot_contents for that check.
I've investigated the slot invalidation and copying slots behaviors.
We cannot copy a slot if it doesn't reserve WAL, but IIUC the slot's
restart_lsn is not reset by slot invalidation due to other than
RS_INVAL_WAL_REMOVED. Therefore, it's possible that we copy a slot
invalidated by for example RS_INVAL_IDLE_TIMEOUT, and the copied
slot's restart_lsn might have already been removed, which ultimately
causes an assertion failure in ocpy_replication_slot():
#ifdef USE_ASSERT_CHECKING
/* Check that the restart_lsn is available */
{
XLogSegNo segno;
XLByteToSeg(copy_restart_lsn, segno, wal_segment_size);
Assert(XLogGetLastRemovedSegno() < segno);
}
#endif
I think this issue exists from v16 or later, I've not tested yet
though. If my understanding is right, this patch has to be
backpatched.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Some review comments for patch v2-0001.
======
Commit message
1.
Currently we can copy an invalidated logical and physical replication slot
using functions 'pg_copy_logical_replication_slot' and
'pg_copy_physical_replication_slot' respectively.
With this patch we will throw an error in such cases.
/we can copy an invalidated logical and physical replication slot/we
can copy invalidated logical and physical replication slots/
======
doc/src/sgml/func.sgml
pg_copy_physical_replication_slot:
2.
- is omitted, the same value as the
source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated physical replication slot in not allowed.
Typo /in/is/
Also, IMO you don't need to say "physical replication slot" because it
is clear from the function's name.
SUGGESTION
Copy of an invalidated slot is not allowed.
~~~
pg_copy_logical_replication_slot:
3.
+ Copy of an invalidated logical replication slot in not allowed.
Typo /in/is/
Also, IMO you don't need to say "logical replication slot" because it
is clear from the function's name.
SUGGESTION
Copy of an invalidated slot is not allowed.
======
src/backend/replication/slotfuncs.c
copy_replication_slot:
4.
+ /* Check if source slot was invalidated while copying of slot */
+ SpinLockAcquire(&src->mutex);
+ first_slot_contents = *src;
+ SpinLockRelease(&src->mutex);
+
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
+
+ if (src_isinvalidated)
+ ereport(ERROR,
+ (errmsg("could not copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the
copy operation.")));
4a.
We already know that it was not invalid the FIRST time we looked at
it, so IMO we only need to confirm that the SECOND look gives the same
answer. IOW, I thought the code should be like below. (AFAICT
Sawada-san's review says the same as this).
Also, I think it is better to say "became invalidated" instead of "was
invalidated", to imply the state was known to be ok before the copy.
SUGGESTION
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
~
4b.
Unnecessary parentheses in the ereport.
~
4c.
There seems some weird mix of tense "cannot copy" versus "could not
copy" already in this file. But, maybe at least you can be consistent
within the patch and always say "cannot".
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Sat, 22 Feb 2025 at 04:49, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Feb 21, 2025 at 4:30 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Fri, 21 Feb 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primarynode.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2'); ?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced-----------+---------------+-----------+--------+----------+-----------+
--------+------------+------+--------------+-------------+---------------
------+------------+---------------+-----------+----------------------------------+-------------+----------------------
--+----------+--------test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.Hi Vignesh,
Thanks for reviewing the patch.
Thanks for updating the patch. I have a question related to it.
I have tested the above scenario and was able to reproduce it. I have fixed it in
the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?It's not clear to me why increasing the lock level can solve it, could you
elaborate a bit more on this ?In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on
'ReplicationSlotControlLock'
Also in function 'copy_replication_slot' we take a SHARED lock on
'ReplicationSlotControlLock' during fetching of source slot.So, for the case described by Vignesh in [1], first
InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock
on 'ReplicationSlotControlLock'. We are now holding the function using
the sleep
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}Now we create a copy of the slot since 'copy_replication_slot' takes
a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock
and fetch the info of the source slot (the slot is not invalidated
till now). and the function 'copy_replication_slot' calls function
'create_logical_replication_slot' which takes a EXCLUSIVE lock on
ReplicationSlotControlLock and hence it will wait for function
InvalidateObsoleteReplicationSlot to release lock. Once the function
'InvalidateObsoleteReplicationSlot' releases the lock, the execution
of 'create_logical_replication_slot' continues and creates a copy of
the source slot.Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE
lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence,
it will wait for the 'InvalidateObsoleteReplicationSlot' to release
the lock and then fetch the source slot info and try to create the
copied slot (which will fail as source slot is invalidated before we
fetch its info)Besides, do we need one more invalidated check in the following codes after
creating the slot ?/*
* Check if the source slot still exists and is valid. We regard it as
* invalid if the type of replication slot or name has been changed,
* or the restart_lsn either is invalid or has gone backward. (The
...This approach seems more feasible to me. It also resolves the issue
suggested by Vignesh in [1]. I have made changes for the same in v3
patch.I agree to check if the source slot got invalidated during the copy.
But why do we need to search the slot by the slot name again as
follows?+ /* Check if source slot was invalidated while copying of slot */ + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); + + for (int i = 0; i < max_replication_slots; i++) + { + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; + + if (s->in_use && strcmp(NameStr(s->data.name), NameStr(*src_name)) == 0) + { + /* Copy the slot contents while holding spinlock */ + SpinLockAcquire(&s->mutex); + first_slot_contents = *s; + SpinLockRelease(&s->mutex); + src = s; + break; + } + } + + LWLockRelease(ReplicationSlotControlLock);I think 'src' already points to the source slot.
Hi Sawada san,
Thanks for reviewing the patch.
I have used the 'src' instead of iterating again. I have attached the
updated v4 patch.Thank you for updating the patch! I have one comment:
+ /* Check if source slot was invalidated while copying of slot */ + SpinLockAcquire(&src->mutex); + first_slot_contents = *src; + SpinLockRelease(&src->mutex);We don't need to copy the source slot contents again since we already
do as follows:/* Copy data of source slot again */
SpinLockAcquire(&src->mutex);
second_slot_contents = *src;
SpinLockRelease(&src->mutex);I think we can use second_slot_contents for that check.
I agree. I have updated the v5 patch to use second_slot_contents
I've investigated the slot invalidation and copying slots behaviors.
We cannot copy a slot if it doesn't reserve WAL, but IIUC the slot's
restart_lsn is not reset by slot invalidation due to other than
RS_INVAL_WAL_REMOVED. Therefore, it's possible that we copy a slot
invalidated by for example RS_INVAL_IDLE_TIMEOUT, and the copied
slot's restart_lsn might have already been removed, which ultimately
causes an assertion failure in ocpy_replication_slot():#ifdef USE_ASSERT_CHECKING
/* Check that the restart_lsn is available */
{
XLogSegNo segno;XLByteToSeg(copy_restart_lsn, segno, wal_segment_size);
Assert(XLogGetLastRemovedSegno() < segno);
}
#endifI think this issue exists from v16 or later, I've not tested yet
though. If my understanding is right, this patch has to be
backpatched.
I have tested the above in HEAD, PG 17 and PG 16 and found that we can
hit the above ASSERT condition in all three branches. With the
following steps:
1. create a physical replication setup
2. In standby create a logical replication slot.
3. change wal_level of primary to 'replica' and restart primary. The
slot is invalidated with 'wal_level_insufficient'
4. change wal_level of primary to 'logical' and restart primary.
5. In primary insert some records and run checkpoint. Also run a
checkpoint on standby. So, some initial wal files are removed.
6. Now copy the logical replication slot created in step 2. Then we
can hit the assert.
I agree that backpatching the patch can resolve this as it prevents
copying of invalidated slots.
I have attached the following patches:
v5-0001 : for HEAD
v5_PG_17_PG_16-0001 : for PG17 and PG16
Thanks and Regards,
Shlok Kyal
Attachments:
v5-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=v5-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From aa7b35e279a772b63470d11aa7e28e62a29063d4 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Tue, 4 Feb 2025 14:58:32 +0530
Subject: [PATCH v5] Restrict copying of invalidated replication slots
Currently we can copy invalidated logical and physical replication slots
using functions 'pg_copy_logical_replication_slot' and
'pg_copy_physical_replication_slot' respectively.
With this patch we will throw an error in such cases.
---
doc/src/sgml/func.sgml | 4 +++-
src/backend/replication/slotfuncs.c | 16 ++++++++++++++++
.../recovery/t/035_standby_logical_decoding.pl | 9 +++++++++
3 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9f60a476eb3..5ff04976862 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29352,7 +29352,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -29374,6 +29375,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index f652ec8a73e..18adf72cf7e 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -604,6 +604,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
ReplicationSlot second_slot_contents;
XLogRecPtr src_restart_lsn;
bool src_islogical;
+ bool src_isinvalidated;
bool temporary;
char *plugin;
Datum values[2];
@@ -661,6 +662,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -678,6 +680,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (src_isinvalidated)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -774,6 +783,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 505e85d1eb6..23b235412ac 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -553,6 +553,15 @@ $handle =
check_pg_recvlogical_stderr($handle,
"can no longer access replication slot \"vacuum_full_activeslot\"");
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.41.0.windows.3
v5_PG_17_PG_16-0001-Restrict-copying-of-invalidated-repli.patchapplication/octet-stream; name=v5_PG_17_PG_16-0001-Restrict-copying-of-invalidated-repli.patchDownload
From 914cf64fd8bdc247d77b7bea9a05903695df8d09 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Mon, 24 Feb 2025 14:58:28 +0530
Subject: [PATCH v5_PG_17_PG_16] Restrict copying of invalidated replication
slots
Currently we can copy invalidated logical and physical replication slots
using functions 'pg_copy_logical_replication_slot' and
'pg_copy_physical_replication_slot' respectively.
With this patch we will throw an error in such cases.
---
doc/src/sgml/func.sgml | 4 +++-
src/backend/replication/slotfuncs.c | 16 ++++++++++++++++
.../recovery/t/035_standby_logical_decoding.pl | 9 +++++++++
3 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1c5e4bc5732..9acf198c03d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29096,7 +29096,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -29118,6 +29119,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e3..7e92427f76e 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -607,6 +607,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
ReplicationSlot second_slot_contents;
XLogRecPtr src_restart_lsn;
bool src_islogical;
+ bool src_isinvalidated;
bool temporary;
char *plugin;
Datum values[2];
@@ -664,6 +665,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -681,6 +683,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (src_isinvalidated)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -777,6 +786,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 4628f9fb806..1d72512ee0d 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -545,6 +545,15 @@ check_pg_recvlogical_stderr($handle,
"can no longer get changes from replication slot \"vacuum_full_activeslot\""
);
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.41.0.windows.3
On Sun, 23 Feb 2025 at 06:46, Peter Smith <smithpb2250@gmail.com> wrote:
Some review comments for patch v2-0001.
======
Commit message1.
Currently we can copy an invalidated logical and physical replication slot
using functions 'pg_copy_logical_replication_slot' and
'pg_copy_physical_replication_slot' respectively.
With this patch we will throw an error in such cases./we can copy an invalidated logical and physical replication slot/we
can copy invalidated logical and physical replication slots/
Updated the commit message
======
doc/src/sgml/func.sgmlpg_copy_physical_replication_slot:
2. - is omitted, the same value as the source slot is used. + is omitted, the same value as the source slot is used. Copy of an + invalidated physical replication slot in not allowed.Typo /in/is/
Also, IMO you don't need to say "physical replication slot" because it
is clear from the function's name.SUGGESTION
Copy of an invalidated slot is not allowed.
Fixed
~~~
pg_copy_logical_replication_slot:
3.
+ Copy of an invalidated logical replication slot in not allowed.Typo /in/is/
Also, IMO you don't need to say "logical replication slot" because it
is clear from the function's name.SUGGESTION
Copy of an invalidated slot is not allowed.
Fixed
======
src/backend/replication/slotfuncs.ccopy_replication_slot:
4. + /* Check if source slot was invalidated while copying of slot */ + SpinLockAcquire(&src->mutex); + first_slot_contents = *src; + SpinLockRelease(&src->mutex); + + src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE); + + if (src_isinvalidated) + ereport(ERROR, + (errmsg("could not copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation.")));4a.
We already know that it was not invalid the FIRST time we looked at
it, so IMO we only need to confirm that the SECOND look gives the same
answer. IOW, I thought the code should be like below. (AFAICT
Sawada-san's review says the same as this).Also, I think it is better to say "became invalidated" instead of "was
invalidated", to imply the state was known to be ok before the copy.SUGGESTION
+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR,~
4b.
Unnecessary parentheses in the ereport.~
4c.
There seems some weird mix of tense "cannot copy" versus "could not
copy" already in this file. But, maybe at least you can be consistent
within the patch and always say "cannot".
Fixed.
I have addressed the above comments in v5 patch [1]/messages/by-id/CANhcyEUHp6cRfaKf0ZqHCppCqpqzmsf5swpbnYGyRU+N+ihi=Q@mail.gmail.com.
[1]: /messages/by-id/CANhcyEUHp6cRfaKf0ZqHCppCqpqzmsf5swpbnYGyRU+N+ihi=Q@mail.gmail.com
Thanks and Regards,
Shlok Kyal
On Mon, Feb 24, 2025 at 3:06 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Sat, 22 Feb 2025 at 04:49, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Feb 21, 2025 at 4:30 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Fri, 21 Feb 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primarynode.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2'); ?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced-----------+---------------+-----------+--------+----------+-----------+
--------+------------+------+--------------+-------------+---------------
------+------------+---------------+-----------+----------------------------------+-------------+----------------------
--+----------+--------test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.Hi Vignesh,
Thanks for reviewing the patch.
Thanks for updating the patch. I have a question related to it.
I have tested the above scenario and was able to reproduce it. I have fixed it in
the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?It's not clear to me why increasing the lock level can solve it, could you
elaborate a bit more on this ?In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on
'ReplicationSlotControlLock'
Also in function 'copy_replication_slot' we take a SHARED lock on
'ReplicationSlotControlLock' during fetching of source slot.So, for the case described by Vignesh in [1], first
InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock
on 'ReplicationSlotControlLock'. We are now holding the function using
the sleep
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}Now we create a copy of the slot since 'copy_replication_slot' takes
a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock
and fetch the info of the source slot (the slot is not invalidated
till now). and the function 'copy_replication_slot' calls function
'create_logical_replication_slot' which takes a EXCLUSIVE lock on
ReplicationSlotControlLock and hence it will wait for function
InvalidateObsoleteReplicationSlot to release lock. Once the function
'InvalidateObsoleteReplicationSlot' releases the lock, the execution
of 'create_logical_replication_slot' continues and creates a copy of
the source slot.Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE
lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence,
it will wait for the 'InvalidateObsoleteReplicationSlot' to release
the lock and then fetch the source slot info and try to create the
copied slot (which will fail as source slot is invalidated before we
fetch its info)Besides, do we need one more invalidated check in the following codes after
creating the slot ?/*
* Check if the source slot still exists and is valid. We regard it as
* invalid if the type of replication slot or name has been changed,
* or the restart_lsn either is invalid or has gone backward. (The
...This approach seems more feasible to me. It also resolves the issue
suggested by Vignesh in [1]. I have made changes for the same in v3
patch.I agree to check if the source slot got invalidated during the copy.
But why do we need to search the slot by the slot name again as
follows?+ /* Check if source slot was invalidated while copying of slot */ + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); + + for (int i = 0; i < max_replication_slots; i++) + { + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; + + if (s->in_use && strcmp(NameStr(s->data.name), NameStr(*src_name)) == 0) + { + /* Copy the slot contents while holding spinlock */ + SpinLockAcquire(&s->mutex); + first_slot_contents = *s; + SpinLockRelease(&s->mutex); + src = s; + break; + } + } + + LWLockRelease(ReplicationSlotControlLock);I think 'src' already points to the source slot.
Hi Sawada san,
Thanks for reviewing the patch.
I have used the 'src' instead of iterating again. I have attached the
updated v4 patch.Thank you for updating the patch! I have one comment:
+ /* Check if source slot was invalidated while copying of slot */ + SpinLockAcquire(&src->mutex); + first_slot_contents = *src; + SpinLockRelease(&src->mutex);We don't need to copy the source slot contents again since we already
do as follows:/* Copy data of source slot again */
SpinLockAcquire(&src->mutex);
second_slot_contents = *src;
SpinLockRelease(&src->mutex);I think we can use second_slot_contents for that check.
I agree. I have updated the v5 patch to use second_slot_contents
I've investigated the slot invalidation and copying slots behaviors.
We cannot copy a slot if it doesn't reserve WAL, but IIUC the slot's
restart_lsn is not reset by slot invalidation due to other than
RS_INVAL_WAL_REMOVED. Therefore, it's possible that we copy a slot
invalidated by for example RS_INVAL_IDLE_TIMEOUT, and the copied
slot's restart_lsn might have already been removed, which ultimately
causes an assertion failure in ocpy_replication_slot():#ifdef USE_ASSERT_CHECKING
/* Check that the restart_lsn is available */
{
XLogSegNo segno;XLByteToSeg(copy_restart_lsn, segno, wal_segment_size);
Assert(XLogGetLastRemovedSegno() < segno);
}
#endifI think this issue exists from v16 or later, I've not tested yet
though. If my understanding is right, this patch has to be
backpatched.I have tested the above in HEAD, PG 17 and PG 16 and found that we can
hit the above ASSERT condition in all three branches. With the
following steps:
1. create a physical replication setup
2. In standby create a logical replication slot.
3. change wal_level of primary to 'replica' and restart primary. The
slot is invalidated with 'wal_level_insufficient'
4. change wal_level of primary to 'logical' and restart primary.
5. In primary insert some records and run checkpoint. Also run a
checkpoint on standby. So, some initial wal files are removed.
6. Now copy the logical replication slot created in step 2. Then we
can hit the assert.I agree that backpatching the patch can resolve this as it prevents
copying of invalidated slots.I have attached the following patches:
v5-0001 : for HEAD
v5_PG_17_PG_16-0001 : for PG17 and PG16
I've checked if this issue exists also on v15 or older, but IIUC it
doesn't exist, fortunately. Here is the summary:
Scenario-1: the source gets invalidated before the copy function
fetches its contents for the first time. In this case, since the
source slot's restart_lsn is already an invalid LSN it raises an error
appropriately. In v15, we have only one slot invaldation reason, WAL
removal, therefore we always reset the slot's restart_lsn to
InvalidXlogRecPtr.
Scenario-2: the source gets invalidated before the copied slot is
created (i.e., between first content copy and
create_logical/physical_replication_slot()). In this case, the copied
slot could have a valid restart_lsn value that however might point to
a WAL segment that might have already been removed. However, since
copy_restart_lsn will be an invalid LSN (=0), it's caught by the
following if condition:
if (copy_restart_lsn < src_restart_lsn ||
src_islogical != copy_islogical ||
strcmp(copy_name, NameStr(*src_name)) != 0)
ereport(ERROR,
(errmsg("could not copy replication slot \"%s\"",
NameStr(*src_name)),
errdetail("The source replication slot was
modified incompatibly during the copy operation.")));
Scenario-3: the source gets invalidated after creating the copied slot
(i.e. after create_logical/physical_replication_slot()). In this case,
since the newly copied slot have the same restart_lsn as the source
slot, both slots are invalidated.
If the above analysis is right, I think the patches are mostly ready.
I've made some changes to the patches:
- removed src_isinvalidated variable as it's not necessarily necessary.
- updated the commit message.
Please review them. If there are no further comments on these patches,
I'm going to push them.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
master_v6-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=master_v6-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 776c6128ded92aaadfadb172d1a629c9e3062bfd Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 24 Feb 2025 10:21:12 -0800
Subject: [PATCH v6] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 4 +++-
src/backend/replication/slotfuncs.c | 14 ++++++++++++++
.../recovery/t/035_standby_logical_decoding.pl | 9 +++++++++
3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9f60a476eb3..5ff04976862 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29352,7 +29352,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -29374,6 +29375,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index f652ec8a73e..c0cd30803a4 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -678,6 +678,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -774,6 +781,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 505e85d1eb6..23b235412ac 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -553,6 +553,15 @@ $handle =
check_pg_recvlogical_stderr($handle,
"can no longer access replication slot \"vacuum_full_activeslot\"");
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.43.5
REL17_v6-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=REL17_v6-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 153aa362bdcca52632567c2de63ed091e85deb64 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 24 Feb 2025 10:21:12 -0800
Subject: [PATCH v6] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 4 +++-
src/backend/replication/slotfuncs.c | 14 ++++++++++++++
.../recovery/t/035_standby_logical_decoding.pl | 9 +++++++++
3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1c5e4bc5732..9acf198c03d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29096,7 +29096,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -29118,6 +29119,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e3..0079e2971c1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -681,6 +681,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -777,6 +784,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 4628f9fb806..1d72512ee0d 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -545,6 +545,15 @@ check_pg_recvlogical_stderr($handle,
"can no longer get changes from replication slot \"vacuum_full_activeslot\""
);
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.43.5
REL16_v6-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=REL16_v6-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 5160d595ace3d00db485919985985d4fea255e40 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 24 Feb 2025 10:21:12 -0800
Subject: [PATCH v6] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 4 +++-
src/backend/replication/slotfuncs.c | 14 ++++++++++++++
.../recovery/t/035_standby_logical_decoding.pl | 9 +++++++++
3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 042b225b920..de6d45d0285 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27259,7 +27259,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -27281,6 +27282,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6035cf48160..6c6da0268b0 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -756,6 +756,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -843,6 +850,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index e90191fe2aa..be290a4d5e0 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -525,6 +525,15 @@ check_pg_recvlogical_stderr($handle,
"can no longer get changes from replication slot \"vacuum_full_activeslot\""
);
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.43.5
On Tue, 25 Feb 2025 at 01:03, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Feb 24, 2025 at 3:06 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Sat, 22 Feb 2025 at 04:49, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Feb 21, 2025 at 4:30 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Fri, 21 Feb 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi,
Currently, we can copy an invalidated slot using the function
'pg_copy_logical_replication_slot'. As per the suggestion in the
thread [1], we should prohibit copying of such slots.I have created a patch to address the issue.
This patch does not fix all the copy_replication_slot scenarios completely, there is a very corner concurrency case where an invalidated slot still gets copied: + /* We should not copy invalidated replication slots */ + if (src_isinvalidated) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot copy an invalidated replication slot")));Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause
is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or
add a sleep in InvalidatePossiblyObsoleteSlot function like below:
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}
step 4) Reduce wal_level on the primary to replica and restart the primarynode.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1',
'test2'); -- It will wait till the lock held by
InvalidatePossiblyObsoleteSlot is released while trying to create a
slot.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue the breakpoint
held at step 3), the replication control lock will be released and the
invalidated slot will be copiedAfter this:
postgres=# SELECT 'copy' FROM
pg_copy_logical_replication_slot('test1', 'test2'); ?column?
----------
copy
(1 row)-- The invalidated slot (test1) is copied successfully:
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary
| active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
e | inactive_since | conflicting |
invalidation_reason | failover | synced-----------+---------------+-----------+--------+----------+-----------+
--------+------------+------+--------------+-------------+---------------
------+------------+---------------+-----------+----------------------------------+-------------+----------------------
--+----------+--------test1 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| lost | | f
| 2025-02-13 15:26:54.666725+05:30 | t |
wal_level_insufficient | f | f
test2 | test_decoding | logical | 5 | postgres | f
| f | | | 745 | 0/4029060 | 0/4029098
| reserved | | f
| 2025-02-13 15:30:30.477836+05:30 | f |
| f | f
(2 rows)-- A subsequent attempt to decode changes from the invalidated slot
(test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x5e77e6c6f300
ERROR: logical decoding on standby requires "wal_level" >= "logical"
on the primary-- Alternatively, the following error may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL,
NULL);
WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0
data
------------
BEGIN 744
COMMIT 744
(2 rows)This is an edge case that can occur under specific conditions
involving replication slot invalidation when there is a huge lag
between primary and standby.
There might be a similar concurrency case for wal_removed too.Hi Vignesh,
Thanks for reviewing the patch.
Thanks for updating the patch. I have a question related to it.
I have tested the above scenario and was able to reproduce it. I have fixed it in
the v2 patch.
Currently we are taking a shared lock on ReplicationSlotControlLock.
This issue can be resolved if we take an exclusive lock instead.
Thoughts?It's not clear to me why increasing the lock level can solve it, could you
elaborate a bit more on this ?In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on
'ReplicationSlotControlLock'
Also in function 'copy_replication_slot' we take a SHARED lock on
'ReplicationSlotControlLock' during fetching of source slot.So, for the case described by Vignesh in [1], first
InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock
on 'ReplicationSlotControlLock'. We are now holding the function using
the sleep
if (cause == RS_INVAL_WAL_LEVEL)
{
while (bsleep)
sleep(1);
}Now we create a copy of the slot since 'copy_replication_slot' takes
a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock
and fetch the info of the source slot (the slot is not invalidated
till now). and the function 'copy_replication_slot' calls function
'create_logical_replication_slot' which takes a EXCLUSIVE lock on
ReplicationSlotControlLock and hence it will wait for function
InvalidateObsoleteReplicationSlot to release lock. Once the function
'InvalidateObsoleteReplicationSlot' releases the lock, the execution
of 'create_logical_replication_slot' continues and creates a copy of
the source slot.Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE
lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence,
it will wait for the 'InvalidateObsoleteReplicationSlot' to release
the lock and then fetch the source slot info and try to create the
copied slot (which will fail as source slot is invalidated before we
fetch its info)Besides, do we need one more invalidated check in the following codes after
creating the slot ?/*
* Check if the source slot still exists and is valid. We regard it as
* invalid if the type of replication slot or name has been changed,
* or the restart_lsn either is invalid or has gone backward. (The
...This approach seems more feasible to me. It also resolves the issue
suggested by Vignesh in [1]. I have made changes for the same in v3
patch.I agree to check if the source slot got invalidated during the copy.
But why do we need to search the slot by the slot name again as
follows?+ /* Check if source slot was invalidated while copying of slot */ + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); + + for (int i = 0; i < max_replication_slots; i++) + { + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; + + if (s->in_use && strcmp(NameStr(s->data.name), NameStr(*src_name)) == 0) + { + /* Copy the slot contents while holding spinlock */ + SpinLockAcquire(&s->mutex); + first_slot_contents = *s; + SpinLockRelease(&s->mutex); + src = s; + break; + } + } + + LWLockRelease(ReplicationSlotControlLock);I think 'src' already points to the source slot.
Hi Sawada san,
Thanks for reviewing the patch.
I have used the 'src' instead of iterating again. I have attached the
updated v4 patch.Thank you for updating the patch! I have one comment:
+ /* Check if source slot was invalidated while copying of slot */ + SpinLockAcquire(&src->mutex); + first_slot_contents = *src; + SpinLockRelease(&src->mutex);We don't need to copy the source slot contents again since we already
do as follows:/* Copy data of source slot again */
SpinLockAcquire(&src->mutex);
second_slot_contents = *src;
SpinLockRelease(&src->mutex);I think we can use second_slot_contents for that check.
I agree. I have updated the v5 patch to use second_slot_contents
I've investigated the slot invalidation and copying slots behaviors.
We cannot copy a slot if it doesn't reserve WAL, but IIUC the slot's
restart_lsn is not reset by slot invalidation due to other than
RS_INVAL_WAL_REMOVED. Therefore, it's possible that we copy a slot
invalidated by for example RS_INVAL_IDLE_TIMEOUT, and the copied
slot's restart_lsn might have already been removed, which ultimately
causes an assertion failure in ocpy_replication_slot():#ifdef USE_ASSERT_CHECKING
/* Check that the restart_lsn is available */
{
XLogSegNo segno;XLByteToSeg(copy_restart_lsn, segno, wal_segment_size);
Assert(XLogGetLastRemovedSegno() < segno);
}
#endifI think this issue exists from v16 or later, I've not tested yet
though. If my understanding is right, this patch has to be
backpatched.I have tested the above in HEAD, PG 17 and PG 16 and found that we can
hit the above ASSERT condition in all three branches. With the
following steps:
1. create a physical replication setup
2. In standby create a logical replication slot.
3. change wal_level of primary to 'replica' and restart primary. The
slot is invalidated with 'wal_level_insufficient'
4. change wal_level of primary to 'logical' and restart primary.
5. In primary insert some records and run checkpoint. Also run a
checkpoint on standby. So, some initial wal files are removed.
6. Now copy the logical replication slot created in step 2. Then we
can hit the assert.I agree that backpatching the patch can resolve this as it prevents
copying of invalidated slots.I have attached the following patches:
v5-0001 : for HEAD
v5_PG_17_PG_16-0001 : for PG17 and PG16I've checked if this issue exists also on v15 or older, but IIUC it
doesn't exist, fortunately. Here is the summary:Scenario-1: the source gets invalidated before the copy function
fetches its contents for the first time. In this case, since the
source slot's restart_lsn is already an invalid LSN it raises an error
appropriately. In v15, we have only one slot invaldation reason, WAL
removal, therefore we always reset the slot's restart_lsn to
InvalidXlogRecPtr.Scenario-2: the source gets invalidated before the copied slot is
created (i.e., between first content copy and
create_logical/physical_replication_slot()). In this case, the copied
slot could have a valid restart_lsn value that however might point to
a WAL segment that might have already been removed. However, since
copy_restart_lsn will be an invalid LSN (=0), it's caught by the
following if condition:if (copy_restart_lsn < src_restart_lsn ||
src_islogical != copy_islogical ||
strcmp(copy_name, NameStr(*src_name)) != 0)
ereport(ERROR,
(errmsg("could not copy replication slot \"%s\"",
NameStr(*src_name)),
errdetail("The source replication slot was
modified incompatibly during the copy operation.")));Scenario-3: the source gets invalidated after creating the copied slot
(i.e. after create_logical/physical_replication_slot()). In this case,
since the newly copied slot have the same restart_lsn as the source
slot, both slots are invalidated.If the above analysis is right, I think the patches are mostly ready.
I've made some changes to the patches:- removed src_isinvalidated variable as it's not necessarily necessary.
- updated the commit message.Please review them. If there are no further comments on these patches,
I'm going to push them.
I have verified the above scenarios and I confirm the behaviour described.
I have a small doubt.
In PG_17 and PG_16 we can invalidate physical slots only for
'wal_removed' case [1]https://github.com/postgres/postgres/blob/7c906c5b46f8189a04e67bc550cba33dd3851b6e/src/backend/replication/slot.c#L1600. And copying of such slot already give error
'cannot copy a replication slot that doesn't reserve WAL'. So, in PG17
and PG16 should we check for invalidated source slot only if we are
copying logical slots?
For HEAD the changes looks fine to me as in HEAD we can invalidate
physical slots for 'wal_removed' and 'idle timeout'.
Thanks and Regards,
Shlok Kyal
On Tue, Feb 25, 2025 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've checked if this issue exists also on v15 or older, but IIUC it
doesn't exist, fortunately. Here is the summary:Scenario-1: the source gets invalidated before the copy function
fetches its contents for the first time. In this case, since the
source slot's restart_lsn is already an invalid LSN it raises an error
appropriately. In v15, we have only one slot invaldation reason, WAL
removal, therefore we always reset the slot's restart_lsn to
InvalidXlogRecPtr.Scenario-2: the source gets invalidated before the copied slot is
created (i.e., between first content copy and
create_logical/physical_replication_slot()). In this case, the copied
slot could have a valid restart_lsn value that however might point to
a WAL segment that might have already been removed. However, since
copy_restart_lsn will be an invalid LSN (=0), it's caught by the
following if condition:if (copy_restart_lsn < src_restart_lsn ||
src_islogical != copy_islogical ||
strcmp(copy_name, NameStr(*src_name)) != 0)
ereport(ERROR,
(errmsg("could not copy replication slot \"%s\"",
NameStr(*src_name)),
errdetail("The source replication slot was
modified incompatibly during the copy operation.")));Scenario-3: the source gets invalidated after creating the copied slot
(i.e. after create_logical/physical_replication_slot()). In this case,
since the newly copied slot have the same restart_lsn as the source
slot, both slots are invalidated.
Which part of the code will cover Scenario-3? Shouldn't we give ERROR
for Scenario-3 as well?
--
With Regards,
Amit Kapila.
On Tue, Feb 25, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 25, 2025 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've checked if this issue exists also on v15 or older, but IIUC it
doesn't exist, fortunately. Here is the summary:Scenario-1: the source gets invalidated before the copy function
fetches its contents for the first time. In this case, since the
source slot's restart_lsn is already an invalid LSN it raises an error
appropriately. In v15, we have only one slot invaldation reason, WAL
removal, therefore we always reset the slot's restart_lsn to
InvalidXlogRecPtr.Scenario-2: the source gets invalidated before the copied slot is
created (i.e., between first content copy and
create_logical/physical_replication_slot()). In this case, the copied
slot could have a valid restart_lsn value that however might point to
a WAL segment that might have already been removed. However, since
copy_restart_lsn will be an invalid LSN (=0), it's caught by the
following if condition:if (copy_restart_lsn < src_restart_lsn ||
src_islogical != copy_islogical ||
strcmp(copy_name, NameStr(*src_name)) != 0)
ereport(ERROR,
(errmsg("could not copy replication slot \"%s\"",
NameStr(*src_name)),
errdetail("The source replication slot was
modified incompatibly during the copy operation.")));Scenario-3: the source gets invalidated after creating the copied slot
(i.e. after create_logical/physical_replication_slot()). In this case,
since the newly copied slot have the same restart_lsn as the source
slot, both slots are invalidated.Which part of the code will cover Scenario-3? Shouldn't we give ERROR
for Scenario-3 as well?
In scenario-3, the backend process executing
pg_copy_logical/physical_replication_slot() already holds the new
copied slot and its restart_lsn is the same or older than the source
slot's restart_lsn. Therefore, if the source slot is invalidated at
that timing, the copied slot is invalidated too, resulting in an error
by the backend.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Mon, Feb 24, 2025 at 10:09 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Tue, 25 Feb 2025 at 01:03, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've checked if this issue exists also on v15 or older, but IIUC it
doesn't exist, fortunately. Here is the summary:Scenario-1: the source gets invalidated before the copy function
fetches its contents for the first time. In this case, since the
source slot's restart_lsn is already an invalid LSN it raises an error
appropriately. In v15, we have only one slot invaldation reason, WAL
removal, therefore we always reset the slot's restart_lsn to
InvalidXlogRecPtr.Scenario-2: the source gets invalidated before the copied slot is
created (i.e., between first content copy and
create_logical/physical_replication_slot()). In this case, the copied
slot could have a valid restart_lsn value that however might point to
a WAL segment that might have already been removed. However, since
copy_restart_lsn will be an invalid LSN (=0), it's caught by the
following if condition:if (copy_restart_lsn < src_restart_lsn ||
src_islogical != copy_islogical ||
strcmp(copy_name, NameStr(*src_name)) != 0)
ereport(ERROR,
(errmsg("could not copy replication slot \"%s\"",
NameStr(*src_name)),
errdetail("The source replication slot was
modified incompatibly during the copy operation.")));Scenario-3: the source gets invalidated after creating the copied slot
(i.e. after create_logical/physical_replication_slot()). In this case,
since the newly copied slot have the same restart_lsn as the source
slot, both slots are invalidated.If the above analysis is right, I think the patches are mostly ready.
I've made some changes to the patches:- removed src_isinvalidated variable as it's not necessarily necessary.
- updated the commit message.Please review them. If there are no further comments on these patches,
I'm going to push them.I have verified the above scenarios and I confirm the behaviour described.
I have a small doubt.
In PG_17 and PG_16 we can invalidate physical slots only for
'wal_removed' case [1]. And copying of such slot already give error
'cannot copy a replication slot that doesn't reserve WAL'. So, in PG17
and PG16 should we check for invalidated source slot only if we are
copying logical slots?
Although your analysis is correct, I believe we should retain the
validation check. Even though invalidated physical slots in PostgreSQL
16 and 17always have an invalid restart_lsn, maintaining this check
would be harmless. Furthermore, I prefer to maintain consistency in
the codebase across back branches and the master branch rather than
introducing variations.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, Feb 25, 2025 at 11:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Scenario-3: the source gets invalidated after creating the copied slot
(i.e. after create_logical/physical_replication_slot()). In this case,
since the newly copied slot have the same restart_lsn as the source
slot, both slots are invalidated.Which part of the code will cover Scenario-3? Shouldn't we give ERROR
for Scenario-3 as well?In scenario-3, the backend process executing
pg_copy_logical/physical_replication_slot() already holds the new
copied slot and its restart_lsn is the same or older than the source
slot's restart_lsn. Therefore, if the source slot is invalidated at
that timing, the copied slot is invalidated too, resulting in an error
by the backend.
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created? If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.
--
With Regards,
Amit Kapila.
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 25, 2025 at 11:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Scenario-3: the source gets invalidated after creating the copied slot
(i.e. after create_logical/physical_replication_slot()). In this case,
since the newly copied slot have the same restart_lsn as the source
slot, both slots are invalidated.Which part of the code will cover Scenario-3? Shouldn't we give ERROR
for Scenario-3 as well?In scenario-3, the backend process executing
pg_copy_logical/physical_replication_slot() already holds the new
copied slot and its restart_lsn is the same or older than the source
slot's restart_lsn. Therefore, if the source slot is invalidated at
that timing, the copied slot is invalidated too, resulting in an error
by the backend.AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?
Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.
The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the
copy operation."));
How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.
--
With Regards,
Amit Kapila.
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation."));How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.
IIUC in the scenario you mentioned, the loop traversing slots already
passed the position of newly created slot in
ReplicationSlotCtl->replication_slots array, so
MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation."));How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.IIUC in the scenario you mentioned, the loop traversing slots already
passed the position of newly created slot in
ReplicationSlotCtl->replication_slots array, so
MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?
Right, I don't have a simpler solution for this apart from making it
somehow serialize this operation with
InvalidateObsoleteReplicationSlots(). I don't think we need to go that
far to handle the scenario being discussed. It is better to add a
comment for this race and why it won't harm.
--
With Regards,
Amit Kapila.
On Thu, Feb 27, 2025 at 7:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation."));How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.IIUC in the scenario you mentioned, the loop traversing slots already
passed the position of newly created slot in
ReplicationSlotCtl->replication_slots array, so
MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?Right, I don't have a simpler solution for this apart from making it
somehow serialize this operation with
InvalidateObsoleteReplicationSlots(). I don't think we need to go that
far to handle the scenario being discussed.
Agreed.
It is better to add a
comment for this race and why it won't harm.
+1
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, 28 Feb 2025 at 08:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation."));How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.IIUC in the scenario you mentioned, the loop traversing slots already
passed the position of newly created slot in
ReplicationSlotCtl->replication_slots array, so
MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?Right, I don't have a simpler solution for this apart from making it
somehow serialize this operation with
InvalidateObsoleteReplicationSlots(). I don't think we need to go that
far to handle the scenario being discussed. It is better to add a
comment for this race and why it won't harm.
I tried to reproduce the scenario described using the following steps:
Here are the steps:
1. set max_replication_slots = 2
2. create two logical replication slot, lets say slot1 and slot2. drop
the first slot (slot1).
3. Now run pg_copy_logical_replication slot function on slot2. Lets
say copied slot is 'slot_cp'.
In function copy_replication_slot add breakpoint just after function
'create_logical_replication_slot'.
slot_cp will be created. In array
ReplicationSlotCtl->replication_slots, slot_cp will come before slot2.
4. Now invalidate the 'slot2'.
Function 'InvalidateObsoleteReplicationSlots' will be called. Now it
will loop through ReplicationSlotCtl->replication_slots and will get
'slot_cp' first. Since the slot is acquired by copy_replication_slot,
It will wait for the slot to be released. Once slot is released,
'slot_cp' and 'slot2' will be invalidated.
I have tried it with 'wal_removal' invalidation. So it is issue in
PG_13 to HEAD.
I also see a kill signal sent to function 'copy_replication_slot'.
Terminal output:
postgres=# select pg_copy_logical_replication_slot('slot2', 'slot_cp');
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
5. Execute 'SELECT * from pg_replication_slots'. slot_cp is present in
the list with wal_status as lost.
I have added the comment on existing patches for REL_16 to master.
Created a new patch to add only comments in REL13-REL15.
Thanks and Regards,
Shlok Kyal
Attachments:
REL_17_v7-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=REL_17_v7-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 0b76c2e0551601e9dfc08a1f9190b0552d4ca1dd Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Mon, 10 Mar 2025 09:19:38 +0530
Subject: [PATCH v7] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 7 +++--
src/backend/replication/slotfuncs.c | 31 ++++++++++++++++++-
.../t/035_standby_logical_decoding.pl | 9 ++++++
3 files changed, 44 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index d8b27903593..d76730aeec1 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29096,7 +29096,9 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed. Copy of an invalidated slot is not
+ allowed.
</para></entry>
</row>
@@ -29121,7 +29123,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The <literal>failover</literal> option of the source logical slot
is not copied and is set to <literal>false</literal> by default. This
is to avoid the risk of being unable to continue logical replication
- after failover to standby where the slot is being synchronized.
+ after failover to standby where the slot is being synchronized. Copy of
+ an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 43c3cc7336b..106edbb0663 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -681,6 +681,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -690,7 +697,22 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
plugin = NameStr(*(PG_GETARG_NAME(3)));
}
- /* Create new slot and acquire it */
+ /*
+ * Create new slot and acquire it
+ *
+ * After slot is created there can be a race condition with function
+ * InvalidateObsoleteReplicationSlots, when the copied slot appear before
+ * source slot in array ReplicationSlotCtl->replication_slots.
+ *
+ * If just after slot creation, the source slot is invalidated due to
+ * some operation. The execution of InvalidateObsoleteReplicationSlots will
+ * wait for this function to release the copied slot. Then the source slot
+ * will be invalidated. So, the copying of source slot is successful in this
+ * case.
+ *
+ * The wal_status of copied slot is set to lost immediately and hence will
+ * not have any harm.
+ */
if (logical_slot)
{
/*
@@ -782,6 +804,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index aeb79f51e71..4eca17885d6 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -583,6 +583,15 @@ check_pg_recvlogical_stderr($handle,
"can no longer get changes from replication slot \"vacuum_full_activeslot\""
);
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
REL_16_v7-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=REL_16_v7-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 599d429193c6810a049cd9a948f29bc10e9bccf2 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 24 Feb 2025 10:21:12 -0800
Subject: [PATCH v7] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 4 ++-
src/backend/replication/slotfuncs.c | 31 ++++++++++++++++++-
.../t/035_standby_logical_decoding.pl | 9 ++++++
3 files changed, 42 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 042b225b920..de6d45d0285 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27259,7 +27259,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -27281,6 +27282,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6035cf48160..5eec788e88a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -756,6 +756,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -765,7 +772,22 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
plugin = NameStr(*(PG_GETARG_NAME(3)));
}
- /* Create new slot and acquire it */
+ /*
+ * Create new slot and acquire it
+ *
+ * After slot is created there can be a race condition with function
+ * InvalidateObsoleteReplicationSlots, when the copied slot appear before
+ * source slot in array ReplicationSlotCtl->replication_slots.
+ *
+ * If just after slot creation, the source slot is invalidated due to
+ * some operation. The execution of InvalidateObsoleteReplicationSlots will
+ * wait for this function to release the copied slot. Then the source slot
+ * will be invalidated. So, the copying of source slot is successful in this
+ * case.
+ *
+ * The wal_status of copied slot is set to lost immediately and hence will
+ * not have any harm.
+ */
if (logical_slot)
{
/*
@@ -843,6 +865,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 8120dfc2132..82ad7ce0c2b 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -563,6 +563,15 @@ check_pg_recvlogical_stderr($handle,
"can no longer get changes from replication slot \"vacuum_full_activeslot\""
);
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
REL_15-REL_13_v7-0001-Comment-race-condition-in-copy_replication_slot.patchapplication/octet-stream; name=REL_15-REL_13_v7-0001-Comment-race-condition-in-copy_replication_slot.patchDownload
From 475e235e051f8745bd55720370ba452a121d6b95 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Mon, 10 Mar 2025 10:53:35 +0530
Subject: [PATCH v7] Comment race condition in copy_replication_slot
After copied slot is created there can be a race condition with function
InvalidateObsoleteReplicationSlots, when the copied slot appear before
source slot in array ReplicationSlotCtl->replication_slots.
If just after slot creation, the source slot is invalidated due to
some operation. The execution of InvalidateObsoleteReplicationSlots will
wait for copy_replication_slot function to release the copied slot. Then the
source slot will be invalidated. So, the copying of source slot is successful
in this case.
The wal_status of copied slot is set to lost immediately and hence will
not have any harm.
---
src/backend/replication/slotfuncs.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 16a35279037..2c710a9c180 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -756,7 +756,22 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
plugin = NameStr(*(PG_GETARG_NAME(3)));
}
- /* Create new slot and acquire it */
+ /*
+ * Create new slot and acquire it
+ *
+ * After slot is created there can be a race condition with function
+ * InvalidateObsoleteReplicationSlots, when the copied slot appear before
+ * source slot in array ReplicationSlotCtl->replication_slots.
+ *
+ * If just after slot creation, the source slot is invalidated due to
+ * some operation. The execution of InvalidateObsoleteReplicationSlots will
+ * wait for this function to release the copied slot. Then the source slot
+ * will be invalidated. So, the copying of source slot is successful in this
+ * case.
+ *
+ * The wal_status of copied slot is set to lost immediately and hence will
+ * not have any harm.
+ */
if (logical_slot)
{
/*
--
2.34.1
master_v7-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=master_v7-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 22760046a3b5dcc1e6123fa413d5ce86190724d0 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Mon, 3 Mar 2025 12:11:42 +0530
Subject: [PATCH v7] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 6 ++--
src/backend/replication/slotfuncs.c | 31 ++++++++++++++++++-
.../t/035_standby_logical_decoding.pl | 9 ++++++
3 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 51dd8ad6571..cda85e811ed 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29486,7 +29486,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -29511,7 +29512,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The <literal>failover</literal> option of the source logical slot
is not copied and is set to <literal>false</literal> by default. This
is to avoid the risk of being unable to continue logical replication
- after failover to standby where the slot is being synchronized.
+ after failover to standby where the slot is being synchronized. Copy of
+ an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 146eef5871e..999c14c920d 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -678,6 +678,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -687,7 +694,22 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
plugin = NameStr(*(PG_GETARG_NAME(3)));
}
- /* Create new slot and acquire it */
+ /*
+ * Create new slot and acquire it
+ *
+ * After slot is created there can be a race condition with function
+ * InvalidateObsoleteReplicationSlots, when the copied slot appear before
+ * source slot in array ReplicationSlotCtl->replication_slots.
+ *
+ * If just after slot creation, the source slot is invalidated due to
+ * some operation. The execution of InvalidateObsoleteReplicationSlots will
+ * wait for this function to release the copied slot. Then the source slot
+ * will be invalidated. So, the copying of source slot is successful in this
+ * case.
+ *
+ * The wal_status of copied slot is set to lost immediately and hence will
+ * not have any harm.
+ */
if (logical_slot)
{
/*
@@ -779,6 +801,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /* Check if source slot became invalidated during the copy operation */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 8903177d883..44046e6a12a 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -591,6 +591,15 @@ $handle =
check_pg_recvlogical_stderr($handle,
"can no longer access replication slot \"vacuum_full_activeslot\"");
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
On Mon, Mar 10, 2025 at 11:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I tried to reproduce the scenario described using the following steps:
Here are the steps:
1. set max_replication_slots = 2
2. create two logical replication slot, lets say slot1 and slot2. drop
the first slot (slot1).3. Now run pg_copy_logical_replication slot function on slot2. Lets
say copied slot is 'slot_cp'.
In function copy_replication_slot add breakpoint just after function
'create_logical_replication_slot'.
slot_cp will be created. In array
ReplicationSlotCtl->replication_slots, slot_cp will come before slot2.4. Now invalidate the 'slot2'.
Function 'InvalidateObsoleteReplicationSlots' will be called. Now it
will loop through ReplicationSlotCtl->replication_slots and will get
'slot_cp' first. Since the slot is acquired by copy_replication_slot,
It will wait for the slot to be released. Once slot is released,
'slot_cp' and 'slot2' will be invalidated.
It is also possible that InvalidateObsoleteReplicationSlots() has
already past slot_cp location in which case it doesn't need to wait
for the backend.
I have changed the comments for the master branch patch. Do let me
know what you think of the attached change. If you agree with this,
then please prepare the patches for back-branches that reflect this
change.
--
With Regards,
Amit Kapila.
Attachments:
master_v7-amit.diff.txttext/plain; charset=US-ASCII; name=master_v7-amit.diff.txtDownload
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 999c14c920d..215b797664f 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -694,22 +694,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
plugin = NameStr(*(PG_GETARG_NAME(3)));
}
- /*
- * Create new slot and acquire it
- *
- * After slot is created there can be a race condition with function
- * InvalidateObsoleteReplicationSlots, when the copied slot appear before
- * source slot in array ReplicationSlotCtl->replication_slots.
- *
- * If just after slot creation, the source slot is invalidated due to
- * some operation. The execution of InvalidateObsoleteReplicationSlots will
- * wait for this function to release the copied slot. Then the source slot
- * will be invalidated. So, the copying of source slot is successful in this
- * case.
- *
- * The wal_status of copied slot is set to lost immediately and hence will
- * not have any harm.
- */
+ /* Create new slot and acquire it */
if (logical_slot)
{
/*
@@ -801,7 +786,14 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
- /* Check if source slot became invalidated during the copy operation */
+ /*
+ * Copying an invalid slot doesn't make sense. Note that the source
+ * slot can become invalid after we creat the new slot and copy the
+ * data of source slot. This is possible because the operations in
+ * InvalidateObsoleteReplicationSlots() are not serialized with this
+ * function. Even though we can't detect such a case here, the copied
+ * slot will become invalid in the next checkpoint cycle.
+ */
if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
ereport(ERROR,
errmsg("cannot copy replication slot \"%s\"",
On Sun, Mar 9, 2025 at 11:16 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Fri, 28 Feb 2025 at 08:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation."));How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.IIUC in the scenario you mentioned, the loop traversing slots already
passed the position of newly created slot in
ReplicationSlotCtl->replication_slots array, so
MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?Right, I don't have a simpler solution for this apart from making it
somehow serialize this operation with
InvalidateObsoleteReplicationSlots(). I don't think we need to go that
far to handle the scenario being discussed. It is better to add a
comment for this race and why it won't harm.I tried to reproduce the scenario described using the following steps:
Here are the steps:
1. set max_replication_slots = 2
2. create two logical replication slot, lets say slot1 and slot2. drop
the first slot (slot1).3. Now run pg_copy_logical_replication slot function on slot2. Lets
say copied slot is 'slot_cp'.
In function copy_replication_slot add breakpoint just after function
'create_logical_replication_slot'.
slot_cp will be created. In array
ReplicationSlotCtl->replication_slots, slot_cp will come before slot2.4. Now invalidate the 'slot2'.
Function 'InvalidateObsoleteReplicationSlots' will be called. Now it
will loop through ReplicationSlotCtl->replication_slots and will get
'slot_cp' first. Since the slot is acquired by copy_replication_slot,
It will wait for the slot to be released. Once slot is released,
'slot_cp' and 'slot2' will be invalidated.I have tried it with 'wal_removal' invalidation. So it is issue in
PG_13 to HEAD.
I also see a kill signal sent to function 'copy_replication_slot'.Terminal output:
postgres=# select pg_copy_logical_replication_slot('slot2', 'slot_cp');
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.5. Execute 'SELECT * from pg_replication_slots'. slot_cp is present in
the list with wal_status as lost.I have added the comment on existing patches for REL_16 to master.
Created a new patch to add only comments in REL13-REL15.
I think we don't need a patch adding only comments to v15 or older. We
can either write the fact in the commit message that v15 or older are
not affected fortunately or add an explicit check if the copied slot
doesn't have invalid restart_lsn.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Mon, 17 Mar 2025 at 12:18, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Mar 10, 2025 at 11:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I tried to reproduce the scenario described using the following steps:
Here are the steps:
1. set max_replication_slots = 2
2. create two logical replication slot, lets say slot1 and slot2. drop
the first slot (slot1).3. Now run pg_copy_logical_replication slot function on slot2. Lets
say copied slot is 'slot_cp'.
In function copy_replication_slot add breakpoint just after function
'create_logical_replication_slot'.
slot_cp will be created. In array
ReplicationSlotCtl->replication_slots, slot_cp will come before slot2.4. Now invalidate the 'slot2'.
Function 'InvalidateObsoleteReplicationSlots' will be called. Now it
will loop through ReplicationSlotCtl->replication_slots and will get
'slot_cp' first. Since the slot is acquired by copy_replication_slot,
It will wait for the slot to be released. Once slot is released,
'slot_cp' and 'slot2' will be invalidated.It is also possible that InvalidateObsoleteReplicationSlots() has
already past slot_cp location in which case it doesn't need to wait
for the backend.I have changed the comments for the master branch patch. Do let me
know what you think of the attached change. If you agree with this,
then please prepare the patches for back-branches that reflect this
change.
This change looks good to me. I have prepared the patches with the
suggested changes for PG16, PG17 and master.
I have also addressed the comments by Sawada-san in [1]/messages/by-id/CAD21AoBtjj_xoH+pgHP_CKykj00fayjFqQ60W6az-Drj2+mOdQ@mail.gmail.com and added a
line in the commit message for PG15 and earlier version.
[1]: /messages/by-id/CAD21AoBtjj_xoH+pgHP_CKykj00fayjFqQ60W6az-Drj2+mOdQ@mail.gmail.com
Thanks and Regards,
Shlok Kyal
Attachments:
master_v8-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=master_v8-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From d4aed47793c5afdcdddd48208e66a8b0ea126dc9 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Mon, 3 Mar 2025 12:11:42 +0530
Subject: [PATCH v8] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
For v15 and earlier, since we can invalidate the slot only for reason of
WAL removal and existing check handles the issue, this check is not
required.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 6 ++++--
src/backend/replication/slotfuncs.c | 21 +++++++++++++++++++
.../t/035_standby_logical_decoding.pl | 9 ++++++++
3 files changed, 34 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1c3810e1a04..c78eea429ef 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29503,7 +29503,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -29528,7 +29529,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The <literal>failover</literal> option of the source logical slot
is not copied and is set to <literal>false</literal> by default. This
is to avoid the risk of being unable to continue logical replication
- after failover to standby where the slot is being synchronized.
+ after failover to standby where the slot is being synchronized. Copy of
+ an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 146eef5871e..6f15be35cf1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -678,6 +678,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -779,6 +786,20 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /*
+ * Copying an invalid slot doesn't make sense. Note that the source
+ * slot can become invalid after we create the new slot and copy the
+ * data of source slot. This is possible because the operations in
+ * InvalidateObsoleteReplicationSlots() are not serialized with this
+ * function. Even though we can't detect such a case here, the copied
+ * slot will become invalid in the next checkpoint cycle.
+ */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index c31cab06f1c..ee066626af7 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -591,6 +591,15 @@ $handle =
check_pg_recvlogical_stderr($handle,
"can no longer access replication slot \"vacuum_full_activeslot\"");
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
REL_17_v8-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=REL_17_v8-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 4b67fd5f725de2a92524d1912779a52b61b04f5e Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Mon, 10 Mar 2025 09:19:38 +0530
Subject: [PATCH v8] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
For v15 and earlier, since we can invalidate the slot only for reason of
WAL removal and existing check handles the issue, this check is not
required.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 7 +++++--
src/backend/replication/slotfuncs.c | 21 +++++++++++++++++++
.../t/035_standby_logical_decoding.pl | 9 ++++++++
3 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index d8b27903593..d76730aeec1 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29096,7 +29096,9 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed. Copy of an invalidated slot is not
+ allowed.
</para></entry>
</row>
@@ -29121,7 +29123,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The <literal>failover</literal> option of the source logical slot
is not copied and is set to <literal>false</literal> by default. This
is to avoid the risk of being unable to continue logical replication
- after failover to standby where the slot is being synchronized.
+ after failover to standby where the slot is being synchronized. Copy of
+ an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 43c3cc7336b..7e6c5801e77 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -681,6 +681,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -782,6 +789,20 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /*
+ * Copying an invalid slot doesn't make sense. Note that the source
+ * slot can become invalid after we creat the new slot and copy the
+ * data of source slot. This is possible because the operations in
+ * InvalidateObsoleteReplicationSlots() are not serialized with this
+ * function. Even though we can't detect such a case here, the copied
+ * slot will become invalid in the next checkpoint cycle.
+ */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index aeb79f51e71..4eca17885d6 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -583,6 +583,15 @@ check_pg_recvlogical_stderr($handle,
"can no longer get changes from replication slot \"vacuum_full_activeslot\""
);
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
REL_16_v8-0001-Restrict-copying-of-invalidated-replication-slots.patchapplication/octet-stream; name=REL_16_v8-0001-Restrict-copying-of-invalidated-replication-slots.patchDownload
From 21a12b0aa798087f22c9b759e89eba5bff4c8d14 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 24 Feb 2025 10:21:12 -0800
Subject: [PATCH v8] Restrict copying of invalidated replication slots.
Previously, invalidated logical and physical replication slots could
be copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots
that were invalidated for reasons other than WAL removal retained their
restart_lsn. This meant that a new slot copied from an invalidated
slot could have a restart_lsn pointing to a WAL segment that might
have already been removed.
This commit restricts the copying of invalidated replication slots.
Backpatch to v16, where slots could retain their restart_lsn when
invalidated for reasons other than WAL removal.
For v15 and earlier, since we can invalidate the slot only for reason of
WAL removal and existing check handles the issue, this check is not
required.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
---
doc/src/sgml/func.sgml | 4 +++-
src/backend/replication/slotfuncs.c | 21 +++++++++++++++++++
.../t/035_standby_logical_decoding.pl | 9 ++++++++
3 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 042b225b920..de6d45d0285 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27259,7 +27259,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
source slot.
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
- is omitted, the same value as the source slot is used.
+ is omitted, the same value as the source slot is used. Copy of an
+ invalidated slot is not allowed.
</para></entry>
</row>
@@ -27281,6 +27282,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
from the same <acronym>LSN</acronym> as the source logical slot. Both
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
optional; if they are omitted, the values of the source slot are used.
+ Copy of an invalidated slot is not allowed.
</para></entry>
</row>
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6035cf48160..612bdd99b52 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -756,6 +756,13 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
+ /* Cannot copy an invalidated replication slot */
+ if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot copy invalidated replication slot \"%s\"",
+ NameStr(*src_name)));
+
/* Overwrite params from optional arguments */
if (PG_NARGS() >= 3)
temporary = PG_GETARG_BOOL(2);
@@ -843,6 +850,20 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
NameStr(*src_name)),
errhint("Retry when the source replication slot's confirmed_flush_lsn is valid.")));
+ /*
+ * Copying an invalid slot doesn't make sense. Note that the source
+ * slot can become invalid after we creat the new slot and copy the
+ * data of source slot. This is possible because the operations in
+ * InvalidateObsoleteReplicationSlots() are not serialized with this
+ * function. Even though we can't detect such a case here, the copied
+ * slot will become invalid in the next checkpoint cycle.
+ */
+ if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errmsg("cannot copy replication slot \"%s\"",
+ NameStr(*src_name)),
+ errdetail("The source replication slot was invalidated during the copy operation."));
+
/* Install copied values again */
SpinLockAcquire(&MyReplicationSlot->mutex);
MyReplicationSlot->effective_xmin = copy_effective_xmin;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 8120dfc2132..82ad7ce0c2b 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -563,6 +563,15 @@ check_pg_recvlogical_stderr($handle,
"can no longer get changes from replication slot \"vacuum_full_activeslot\""
);
+# Attempt to copy an invalidated logical replication slot
+($result, $stdout, $stderr) = $node_standby->psql(
+ 'postgres',
+ qq[select pg_copy_logical_replication_slot('vacuum_full_inactiveslot', 'vacuum_full_inactiveslot_copy');],
+ replication => 'database');
+ok( $stderr =~
+ /ERROR: cannot copy invalidated replication slot "vacuum_full_inactiveslot"/,
+ "invalidated slot cannot be copied");
+
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
--
2.34.1
On Mon, 17 Mar 2025 at 22:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Mar 9, 2025 at 11:16 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Fri, 28 Feb 2025 at 08:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation."));How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.IIUC in the scenario you mentioned, the loop traversing slots already
passed the position of newly created slot in
ReplicationSlotCtl->replication_slots array, so
MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?Right, I don't have a simpler solution for this apart from making it
somehow serialize this operation with
InvalidateObsoleteReplicationSlots(). I don't think we need to go that
far to handle the scenario being discussed. It is better to add a
comment for this race and why it won't harm.I tried to reproduce the scenario described using the following steps:
Here are the steps:
1. set max_replication_slots = 2
2. create two logical replication slot, lets say slot1 and slot2. drop
the first slot (slot1).3. Now run pg_copy_logical_replication slot function on slot2. Lets
say copied slot is 'slot_cp'.
In function copy_replication_slot add breakpoint just after function
'create_logical_replication_slot'.
slot_cp will be created. In array
ReplicationSlotCtl->replication_slots, slot_cp will come before slot2.4. Now invalidate the 'slot2'.
Function 'InvalidateObsoleteReplicationSlots' will be called. Now it
will loop through ReplicationSlotCtl->replication_slots and will get
'slot_cp' first. Since the slot is acquired by copy_replication_slot,
It will wait for the slot to be released. Once slot is released,
'slot_cp' and 'slot2' will be invalidated.I have tried it with 'wal_removal' invalidation. So it is issue in
PG_13 to HEAD.
I also see a kill signal sent to function 'copy_replication_slot'.Terminal output:
postgres=# select pg_copy_logical_replication_slot('slot2', 'slot_cp');
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.5. Execute 'SELECT * from pg_replication_slots'. slot_cp is present in
the list with wal_status as lost.I have added the comment on existing patches for REL_16 to master.
Created a new patch to add only comments in REL13-REL15.I think we don't need a patch adding only comments to v15 or older. We
can either write the fact in the commit message that v15 or older are
not affected fortunately or add an explicit check if the copied slot
doesn't have invalid restart_lsn.
I felt adding a message in the commit message would be sufficient. I
have made the changes for the same in [1]/messages/by-id/CANhcyEV+wwhquX3J9J5gKVdhirGFhNxDAXJArz3udp94vNsvPA@mail.gmail.com.
[1]: /messages/by-id/CANhcyEV+wwhquX3J9J5gKVdhirGFhNxDAXJArz3udp94vNsvPA@mail.gmail.com
Thanks and Regards,
Shlok Kyal
On Tue, Mar 18, 2025 at 2:28 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Mon, 17 Mar 2025 at 22:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Mar 9, 2025 at 11:16 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Fri, 28 Feb 2025 at 08:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation."));How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.IIUC in the scenario you mentioned, the loop traversing slots already
passed the position of newly created slot in
ReplicationSlotCtl->replication_slots array, so
MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?Right, I don't have a simpler solution for this apart from making it
somehow serialize this operation with
InvalidateObsoleteReplicationSlots(). I don't think we need to go that
far to handle the scenario being discussed. It is better to add a
comment for this race and why it won't harm.I tried to reproduce the scenario described using the following steps:
Here are the steps:
1. set max_replication_slots = 2
2. create two logical replication slot, lets say slot1 and slot2. drop
the first slot (slot1).3. Now run pg_copy_logical_replication slot function on slot2. Lets
say copied slot is 'slot_cp'.
In function copy_replication_slot add breakpoint just after function
'create_logical_replication_slot'.
slot_cp will be created. In array
ReplicationSlotCtl->replication_slots, slot_cp will come before slot2.4. Now invalidate the 'slot2'.
Function 'InvalidateObsoleteReplicationSlots' will be called. Now it
will loop through ReplicationSlotCtl->replication_slots and will get
'slot_cp' first. Since the slot is acquired by copy_replication_slot,
It will wait for the slot to be released. Once slot is released,
'slot_cp' and 'slot2' will be invalidated.I have tried it with 'wal_removal' invalidation. So it is issue in
PG_13 to HEAD.
I also see a kill signal sent to function 'copy_replication_slot'.Terminal output:
postgres=# select pg_copy_logical_replication_slot('slot2', 'slot_cp');
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.5. Execute 'SELECT * from pg_replication_slots'. slot_cp is present in
the list with wal_status as lost.I have added the comment on existing patches for REL_16 to master.
Created a new patch to add only comments in REL13-REL15.I think we don't need a patch adding only comments to v15 or older. We
can either write the fact in the commit message that v15 or older are
not affected fortunately or add an explicit check if the copied slot
doesn't have invalid restart_lsn.I felt adding a message in the commit message would be sufficient. I
have made the changes for the same in [1].
Thank you for updating the patches. These patches look good to me. I'm
going to push these fixes to master, v17, and v16 branches tomorrow
unless there is further comment.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Apr 2, 2025 at 2:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Mar 18, 2025 at 2:28 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Mon, 17 Mar 2025 at 22:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Mar 9, 2025 at 11:16 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Fri, 28 Feb 2025 at 08:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
this operation. So, isn't it possible that the source slot exists at
the later position in ReplicationSlotCtl->replication_slots and the
loop traversing slots is already ahead from the point where the newly
copied slot is created?Good point. I think it's possible.
If so, the newly created slot won't be marked
as invalid whereas the source slot will be marked as invalid. I agree
that even in such a case, at a later point, the newly created slot
will also be marked as invalid.The wal_status of the newly created slot would immediately become
'lost' and the next checkpoint will invalidate it. Do we need to do
something to deal with this case?+ /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation."));How about adding a similar test as above for MyReplicationSlot? That
should suffice the need because we have already acquired the new slot
by this time and invalidation should signal this process before
marking the new slot as invalid.IIUC in the scenario you mentioned, the loop traversing slots already
passed the position of newly created slot in
ReplicationSlotCtl->replication_slots array, so
MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?Right, I don't have a simpler solution for this apart from making it
somehow serialize this operation with
InvalidateObsoleteReplicationSlots(). I don't think we need to go that
far to handle the scenario being discussed. It is better to add a
comment for this race and why it won't harm.I tried to reproduce the scenario described using the following steps:
Here are the steps:
1. set max_replication_slots = 2
2. create two logical replication slot, lets say slot1 and slot2. drop
the first slot (slot1).3. Now run pg_copy_logical_replication slot function on slot2. Lets
say copied slot is 'slot_cp'.
In function copy_replication_slot add breakpoint just after function
'create_logical_replication_slot'.
slot_cp will be created. In array
ReplicationSlotCtl->replication_slots, slot_cp will come before slot2.4. Now invalidate the 'slot2'.
Function 'InvalidateObsoleteReplicationSlots' will be called. Now it
will loop through ReplicationSlotCtl->replication_slots and will get
'slot_cp' first. Since the slot is acquired by copy_replication_slot,
It will wait for the slot to be released. Once slot is released,
'slot_cp' and 'slot2' will be invalidated.I have tried it with 'wal_removal' invalidation. So it is issue in
PG_13 to HEAD.
I also see a kill signal sent to function 'copy_replication_slot'.Terminal output:
postgres=# select pg_copy_logical_replication_slot('slot2', 'slot_cp');
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.5. Execute 'SELECT * from pg_replication_slots'. slot_cp is present in
the list with wal_status as lost.I have added the comment on existing patches for REL_16 to master.
Created a new patch to add only comments in REL13-REL15.I think we don't need a patch adding only comments to v15 or older. We
can either write the fact in the commit message that v15 or older are
not affected fortunately or add an explicit check if the copied slot
doesn't have invalid restart_lsn.I felt adding a message in the commit message would be sufficient. I
have made the changes for the same in [1].Thank you for updating the patches. These patches look good to me. I'm
going to push these fixes to master, v17, and v16 branches tomorrow
unless there is further comment.
Pushed (yesterday).
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com